1.  Introduction.


In today's complex world, an understanding of how modelling assumptions
affect the optimal military strategies derived from mathematical models is
essential for determining sound solutions to complex problems of
international significance. In this paper we continue the study by one of
the authors of the effects of various modelling assumptions on the
structure of optimal tactical allocation policies by systematically
contrasting the solutions for a sequence of idealized models. These combat
models are too simple to be taken literally but should be interpreted as
indicating general principles to serve as hypotheses for subsequent higher
resolution studies of real world problems via computer simulation or field
experimentation.

In previous papers [34], [35], [36], [37], [38] one of us has
studied the optimal control of deterministic Lanchester attrition processes.
A major result of this previous research was that optimal tactical
allocation policies are quite sensitive to the precise nature of the combat
model adopted, even as to whether the tactical scenario lasts for a
specified period of time or terminates only when a predetermined
"breakpoint" has been reached. We have shown [36] that whether or not
concentration of all fire on a single enemy target type is always the
optimal fire distribution policy depends on whether, for example, enemy
target types undergo a "square-law" or "linear-law" attrition process (see
also [38]).

In the paper at hand, we examine the effects on the structure of the
optimal fire distribution policy of whether combat attrition is modelled
as a deterministic or a stochastic process. Although there has been a
continuing discussion among military operations analysts about the relative
merits of deterministic versus stochastic combat attrition models (in
particular, see [4], [9]), there apparently has been no systematic attempt
to contrast optimal military strategies derived from such different
modelling approaches.

In order to keep the impact of modelling assumptions on optimal
strategies in sharp focus and also for reasons of mathematical tractability,
we consider a simple fire distribution problem for a homogeneous Y force
in Lanchester combat against heterogeneous X forces composed of two
types of weapon systems. Our research approach is to study the same
scenario (prescribed duration battle) using a deterministic combat
attrition model and also a stochastic one and then to compare the
corresponding optimal fire distribution policies.

The solution to the deterministic problem is obtained using modern
optimal control theory (see [8], [27]). As discussed in [37] and [41],
the non-negativity restrictions on the force levels are state variable
inequality constraints (henceforth abbreviated as SVIC's) and require
special treatment (appropriate modification of the usual maximum principle)
when active (see Chapter 6 of [27], [40]). In this paper we shall treat
SVIC's by the method of Speyer and Bryson [32] (see also [19], [24]) of
adjoining an SVIC directly to the return functional with a (Lagrange)
multiplier (see [41]). Unlike the corresponding terminal control problem
studied in [34], however, this "solution" requires several computer-assisted
computations for implementation.

Footnote on the maximum principle: In this paper we employ an equivalent
statement of the Pontryagin maximum principle [27] commonly used by
engineers in the United States. There is a minor sign difference (see
p. 108 of [8]) between these versions.

The solution to the stochastic problem is obtained using the well-known
dynamic programming approach to optimal stochastic control [13], [21],
[12]. The basic equations of optimality (the fundamental functional
equation for the optimal expected-value function (see [12])) are developed.
We derive analytic solutions to these equations for very small numbers of
combatants and thus obtain the optimal closed-loop control. As is the
case for the Lanchester stochastic process (see [9], [20]), a general
solution for arbitrary numbers of combatants has not been obtained for
the fundamental functional equation (actually a system of
differential-difference equations), although solutions for specific (small)
numbers of combatants are readily obtained. Therefore, we have used
finite-difference methods to generate an approximate numerical solution.

The body of this paper is organized in the following fashion. First,
we review a few relevant facts about the Lanchester stochastic process.
Then we state the two optimal control problems that this paper compares.
The method of solving the deterministic problem is outlined. The basic
equations of optimality for the stochastic control problem are developed,
and obtaining an analytic solution to these equations is discussed. The
use of finite difference methods for generating a numerical solution is
described. Then we compare results obtained from the two models and
discuss these results. The implications of these results for defense planners
and military operations analysts are pointed out.

2.  The Lanchester Stochastic Process.

In 1914, in the British journal Engineering, F. W. Lanchester [23]
postulated that under the conditions of "modern warfare" combat between two
homogeneous forces could be described by the equations

$$\frac{dx}{dt} = -ay, \qquad \frac{dy}{dt} = -bx, \tag{1}$$

where a, b are commonly referred to as the Lanchester attrition-rate
coefficients and x(t), y(t) are force levels. During World War II,
B. Koopman suggested a reformulation of such a model in stochastic form
[25]. Subsequent work on stochastic models of combat attrition has been
by R. Snow [31], R. Brown [6], [7], G. Weiss [44], D. Smith [30], and
G. Clark [9]. The stochastic process corresponding to a model like (1)
has been called the Lanchester stochastic process by B. Koopman [20].

Footnote to (1): See [45] for a discussion of the assumptions inherent in
(1). A further discussion of Lanchester-type equations of warfare can be
found in [39]. Further references on deterministic Lanchester formulations
can be found there [39] or in [11].
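The behaviour of (1) can be checked numerically. The sketch below is not from the paper: it assumes the standard closed-form solution of (1) in hyperbolic functions (valid only while both force levels remain positive) and verifies the "square-law" invariant b x^2 - a y^2. The function names and the numerical values are illustrative assumptions.

```python
import math

def square_law(x0, y0, a, b, t):
    """Closed-form solution of dx/dt = -a*y, dy/dt = -b*x.

    Only meaningful while both x and y are positive (assumption of this sketch).
    """
    w = math.sqrt(a * b)
    c, s = math.cosh(w * t), math.sinh(w * t)
    x = x0 * c - y0 * math.sqrt(a / b) * s
    y = y0 * c - x0 * math.sqrt(b / a) * s
    return x, y

def square_law_invariant(x, y, a, b):
    """Quantity that stays constant along any solution of (1)."""
    return b * x * x - a * y * y

# Illustrative (hypothetical) parameter values.
x, y = square_law(100.0, 80.0, 0.04, 0.05, 3.0)
print(x, y, square_law_invariant(x, y, 0.04, 0.05))
```

The invariant is what gives the "square law" its name: the side with the larger value of (attrition coefficient) times (force level squared) wins the deterministic battle.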

Before considering the optimal stochastic control problem, it seems
appropriate for us to review a few results for the Lanchester stochastic
process. Consider combat between a homogeneous X force and a homogeneous
Y force. Let us model this combat as a continuous parameter Markov chain
with stationary transition probabilities (see pp. 188-189 of [26] for a
further discussion of terminology). Let M(t) denote the (integer)
number of X combatants "alive" at time t after the battle begins, and
let N(t) denote the number of Y combatants. We denote the state
probability by P(t,m,n), and thus

$$P(t,m,n) = \mathrm{Prob}[M(t)=m,\ N(t)=n].$$

Footnote on notation: Random variables are denoted by capital letters,
while their realizations are denoted by the corresponding lower case letters.

Footnote on (2): We adopt the convention that P(t,m,n) = 0 for either
m > m_0 or n > n_0.

Making standard assumptions (see [5]), we find that the state probabilities
satisfy the following system of differential-difference equations
for 1 ≤ m ≤ m_0 and 1 ≤ n ≤ n_0:

$$\frac{dP}{dt}(t,m,n) = P(t,m+1,n)A(m+1,n) + P(t,m,n+1)B(m,n+1) - \{A(m,n) + B(m,n)\}P(t,m,n), \tag{2}$$


where m_0 (n_0) is the number of X (Y) combatants at the beginning of
battle at t = 0, i.e. M(t=0) = m_0 (N(t=0) = n_0) with probability one;
A(m,n) is the rate of attrition of the X forces with A(0,n) = 0; and B(m,n)
is the rate of attrition of the Y forces with B(m,0) = 0. In other
words, for a small time increment Δt we have

$$\mathrm{Prob}[\text{one X casualty in time interval from } t \text{ to } t + \Delta t] = A(m,n)\,\Delta t.$$

(Moreover, P(t,m,n) is, more precisely, the transition probability

$$P(t,m,n) = P(t,m,n;\, t=0,m_0,n_0) = \mathrm{Prob}[M(t)=m,\ N(t)=n \mid M(t=0)=m_0,\ N(t=0)=n_0].)$$


Of course, the state space is discrete, i.e. m = 0,1,...,m_0 and
n = 0,1,...,n_0. At state space boundaries, i.e. m = 0 or n = 0,
equation (2) takes the form

$$\begin{aligned}
\frac{dP}{dt}(t,m,0) &= P(t,m+1,0)A(m+1,0) + P(t,m,1)B(m,1) - P(t,m,0)A(m,0),\\
\frac{dP}{dt}(t,0,n) &= P(t,0,n+1)B(0,n+1) + P(t,1,n)A(1,n) - P(t,0,n)B(0,n),\\
\frac{dP}{dt}(t,0,0) &= P(t,1,0)A(1,0) + P(t,0,1)B(0,1).
\end{aligned} \tag{3}$$


Initial conditions for (2) and (3) are

$$P(t=0,m,n) = \begin{cases} 1 & \text{for } m = m_0,\ n = n_0,\\ 0 & \text{otherwise.} \end{cases} \tag{4}$$
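Equations (2) through (4) can be integrated numerically for small force sizes. The sketch below is an illustration, not the authors' procedure: it uses a plain forward-Euler step, and the rates are an assumed pure square-law special case (A = a n, B = b m, with A(0,n) = 0 and B(m,0) = 0). The names `master_rhs` and `integrate` and the parameter values are hypothetical.

```python
import math

def master_rhs(P, A, B):
    """Right-hand side of (2)-(3) for P[m][n], taking P = 0 outside the grid."""
    m0, n0 = len(P) - 1, len(P[0]) - 1
    dP = [[0.0] * (n0 + 1) for _ in range(m0 + 1)]
    for m in range(m0 + 1):
        for n in range(n0 + 1):
            gain = 0.0
            if m < m0:                      # inflow from state (m+1, n): one X casualty
                gain += P[m + 1][n] * A(m + 1, n)
            if n < n0:                      # inflow from state (m, n+1): one Y casualty
                gain += P[m][n + 1] * B(m, n + 1)
            dP[m][n] = gain - (A(m, n) + B(m, n)) * P[m][n]
    return dP

def integrate(m0, n0, A, B, t_end, steps):
    """Forward-Euler integration of the state probabilities from condition (4)."""
    P = [[0.0] * (n0 + 1) for _ in range(m0 + 1)]
    P[m0][n0] = 1.0                         # initial condition (4)
    dt = t_end / steps
    for _ in range(steps):
        dP = master_rhs(P, A, B)
        for m in range(m0 + 1):
            for n in range(n0 + 1):
                P[m][n] += dt * dP[m][n]
    return P

# Assumed square-law rates without operational losses (illustrative values).
a, b = 0.3, 0.2
A = lambda m, n: a * n if m > 0 else 0.0
B = lambda m, n: b * m if n > 0 else 0.0

P = integrate(1, 1, A, B, 2.0, 20000)
print(P[1][1], math.exp(-(a + b) * 2.0))    # one-on-one duel: P(t,1,1) = exp(-(a+b)t)
```

For the one-on-one duel the exact answer is elementary, which makes it a convenient check; for larger m_0 and n_0 the same routine applies, provided the step dt is small compared with 1/(A + B).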
Let us adopt the following terminology for the attrition rates
(and hence the process itself). We say that we have a
(a) linear-law attrition process when

$$A(m,n) = amn, \qquad B(m,n) = bmn, \tag{5}$$

and (b) square-law attrition process when

$$A(m,n) = \beta m + an, \qquad B(m,n) = bm + \alpha n, \tag{6}$$

where α, β may be referred to as operational loss rates.

Although it is well-known that (2) through (4) yield an exponential
solution (the Chapman-Kolmogorov equation expresses the semi-group property
of the state probabilities (see [20])) when A(m,n) and B(m,n) have
been specified (for example, by (6)), general solutions which apply for
all values of m_0 and n_0 have been obtained for this system in only
a few special cases. In the special case when a + α = b + β, Isbell
and Marlow [18] developed a general solution to (2) through (4) for a
square-law stochastic attrition process. Recently, Clark (see pp. 102-104
of [9]) developed the general solution to the linear-law stochastic
attrition process (i.e. A(m,n) and B(m,n) are given by (5)).

One reason why we have reviewed this material is to now point out
to the reader that a general solution to (2) through (4) only exists for
a linear-law attrition process and is very complex (see pp. 102-104 of [9]).
In considering the optimal control of the Lanchester stochastic (square-law)
process, we will encounter a similar system of equations for the optimal
expected-value function. Keeping in mind that a general solution has not
been obtained to the corresponding equations (2) through (4) for the state
probabilities of the square-law stochastic attrition process, the reader
will not be surprised to learn that we have not developed an analytic
solution for the general case of these equations.

Additionally, using the above noted solutions for the Lanchester
stochastic process, Clark (following results in [25] and qualitative results
in [31]) made comparisons [9] (see also Chapter 11 of [4]) of the average
force levels in the stochastic process (denoted as m(t) and n(t)) and
the corresponding force levels x(t) and y(t) in the deterministic
formulation (such as (1)). Unlike the corresponding situation for the
Yule-Furry linear birth process (see pp. 77-78 of [3] or pp. 156-159 of
[10]), there is a bias (due to "boundary effects") in the dynamical behavior
of x(t) and y(t) as compared with m(t) and n(t) for the same values
of a and b. It turns out that m(t) lies above x(t), and the amount
of separation grows over time.
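The bias described above can be illustrated by simulation. The following sketch is not Clark's computation: it assumes a pure square-law process (A = a n, B = b m, no operational losses), samples paths with a standard event-by-event (Gillespie-type) scheme, and compares the sample mean of M(t) with the deterministic x(t) from (1). The symmetric test case, the seed, and all names are illustrative assumptions.

```python
import math
import random

def simulate_survivors(m0, n0, a, b, t_end, rng):
    """One sample path of the assumed square-law process; returns (M, N) at t_end."""
    m, n, t = m0, n0, 0.0
    while m > 0 and n > 0:                  # the battle stops when either side is annihilated
        rate_x, rate_y = a * n, b * m       # A(m,n) and B(m,n)
        total = rate_x + rate_y
        t += rng.expovariate(total)         # exponential waiting time to the next casualty
        if t > t_end:
            break
        if rng.random() < rate_x / total:
            m -= 1                          # X casualty
        else:
            n -= 1                          # Y casualty
    return m, n

def mean_x_survivors(m0, n0, a, b, t_end, runs, seed=0):
    """Monte Carlo estimate of the average X force level at t_end."""
    rng = random.Random(seed)
    total = sum(simulate_survivors(m0, n0, a, b, t_end, rng)[0] for _ in range(runs))
    return total / runs

def deterministic_x(x0, y0, a, b, t):
    """x(t) from the closed-form solution of (1)."""
    w = math.sqrt(a * b)
    return x0 * math.cosh(w * t) - y0 * math.sqrt(a / b) * math.sinh(w * t)

m_bar = mean_x_survivors(5, 5, 0.5, 0.5, 4.0, runs=4000)
x_det = deterministic_x(5.0, 5.0, 0.5, 0.5, 4.0)
print(m_bar, x_det)                         # the stochastic mean is expected to lie above x(t)
```

In this symmetric case the deterministic forces decay together toward zero, whereas in the stochastic process one side is annihilated and the survivors of the other side remain, which is the "boundary effect" behind the gap.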

The above is a major result of Clark's careful investigation, in
which several numerical examples are given to support such points. He
concludes that (see p. 11-19 of [4]) "the deterministic model would have
difficulty approximating a stochastic simulation" with respect to the time
history of force levels. Clark's solution to the stochastic linear-law
process was important in making such a comparison. The fact that the
average of the Lanchester stochastic process does not behave identically
to the corresponding force levels x(t) and y(t) computed according to
the corresponding deterministic model has motivated the paper at hand.

3.  The Optimal Control Problems.

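In this section we state the two optimal control problems that are
considered in this paper. The deterministic optimal control problem
considered is

$$\begin{aligned}
&\underset{\phi_D(t)}{\text{maximize}}\ \{\, r\,y(t_f) - p\,x_1(t_f) - q\,x_2(t_f) \,\} \quad \text{with } t_{\max} \text{ specified},\\
&\text{subject to:}\quad \frac{dx_1}{dt} = -\phi_D\,a_1 y, \qquad \frac{dx_2}{dt} = -(1-\phi_D)\,a_2 y, \qquad \frac{dy}{dt} = -b_1x_1 - b_2x_2,\\
&x_1, x_2, y \ge 0, \qquad 0 \le \phi_D \le 1, \qquad \text{and } t_f \le t_{\max},
\end{aligned} \tag{7}$$

with initial conditions

$$x_1(t=0) = x_1^0, \qquad x_2(t=0) = x_2^0, \qquad y(t=0) = y_0,$$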


where all symbols are explained in the Appendix. In this problem x_1, x_2
and y are called state variables, while φ_D is called a control (or
decision) variable. A constraint such as x_1 ≥ 0 is called a state
variable inequality constraint (SVIC) and requires special treatment (see
below).

The battle lasts for 0 ≤ t ≤ t_max unless, of course, one side or
the other is annihilated before t_max. To be more precise, the battle
terminates under one of the three following circumstances:

(1)  x_1(t_f) = x_2(t_f) = 0 and t_f ≤ t_max,

(2)  y(t_f) = 0 and t_f ≤ t_max,

(3)  t_f = t_max,
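The three termination circumstances can be made concrete with a small simulation of (7). This is a minimal sketch under stated assumptions, not the authors' FORTRAN program: it uses a forward-Euler step, clamps force levels at zero to respect the non-negativity constraints, and takes the fire-distribution rule `phi` as a user-supplied function. The function name `run_battle` and the parameter values are hypothetical.

```python
def run_battle(phi, x1, x2, y, a1, a2, b1, b2, t_max, dt=1e-3):
    """Euler integration of (7) under a rule phi(t, x1, x2, y) with values in [0, 1].

    Returns (reason, t_f, x1, x2, y), where reason is 1, 2 or 3 as in the list above.
    """
    steps = int(round(t_max / dt))
    t = 0.0
    for k in range(steps):
        if x1 == 0.0 and x2 == 0.0:
            return 1, t, x1, x2, y              # circumstance (1): X annihilated
        if y == 0.0:
            return 2, t, x1, x2, y              # circumstance (2): Y annihilated
        u = phi(t, x1, x2, y)
        y_loss_rate = b1 * x1 + b2 * x2         # evaluated before the X levels are updated
        x1 = max(0.0, x1 - dt * u * a1 * y)             # clamp: x1 >= 0
        x2 = max(0.0, x2 - dt * (1.0 - u) * a2 * y)     # clamp: x2 >= 0
        y = max(0.0, y - dt * y_loss_rate)              # clamp: y >= 0
        t = (k + 1) * dt
    if x1 == 0.0 and x2 == 0.0:
        return 1, t_max, x1, x2, y
    if y == 0.0:
        return 2, t_max, x1, x2, y
    return 3, t_max, x1, x2, y                  # circumstance (3): time runs out

# Illustrative rule: concentrate all fire on X1 while it survives, then fire on X2.
fire_on_x1_first = lambda t, x1, x2, y: 1.0 if x1 > 0.0 else 0.0

print(run_battle(fire_on_x1_first, 10.0, 10.0, 100.0, 0.05, 0.05, 0.01, 0.01, t_max=10.0))
print(run_battle(fire_on_x1_first, 10.0, 10.0, 100.0, 0.05, 0.05, 0.01, 0.01, t_max=1.0))
```

With these assumed values the first call ends by annihilation of both X force types before t_max, while the second is cut off by the prescribed duration.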
where t_f denotes the time at which the battle ends. Upon further
analysis, it has been convenient to consider that there are eight "terminal
states," or "target sets." These are shown in Table I. The reader should
note that for S_4 through S_8 the battle ends by the system (as described
by the three state variables x_1, x_2, and y) being driven to a prescribed
terminal state. For these terminal states, t_f is undetermined when
t_f < t_max, since it is then determined by entry to the terminal state,
and this depends upon the control used. For these cases a well-known
transversality condition must hold. The above problem (7) is called a
prescribed duration battle, since the battle lasts for a maximum duration
of t_max, i.e. t_f ≤ t_max.

The corresponding stochastic optimal control problem considered is

$$\underset{\phi_S}{\text{maximize}}\ E[\, r\,N(t_f) - p\,M_1(t_f) - q\,M_2(t_f) \,] \quad \text{with } t_{\max} \text{ specified},$$

subject to: casualties occur randomly as a continuous
parameter Markov chain with stationary transition
probabilities corresponding to the deterministic
process (7),                                                  (8)

$$M_1, M_2, N \ge 0 \quad \text{and} \quad 0 \le \phi_S \le 1,$$

where the random variables M_1(t), M_2(t), N(t) are force levels
(integers), E[·] denotes mathematical expectation, and all other symbols
are explained in the Appendix. In (8) φ_S = φ_S(t, m_1, m_2, n) denotes a
closed-loop control (see [16]). For the deterministic problem (7) we
have not been precise about this point, since it is well-known that
open-loop control (e.g. φ_D = φ_D(t; x_1^0, x_2^0, y_0)) and closed-loop
control (e.g. φ_D = k(t, x_1, x_2, y)) are equivalent and yield identical
results in trajectory and payoff [16]. For stochastic control problems this
equivalence is, of course, not true (see [12]).
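A closed-loop control for a problem like (8) can be tabulated by backward recursion on the integer state (m_1, m_2, n). The sketch below is an assumption-laden illustration, not the functional equation developed by the authors: it assumes X_1 casualties occur at rate φ a_1 n, X_2 casualties at rate (1 - φ) a_2 n, Y casualties at rate b_1 m_1 + b_2 m_2, that the battle stops when either side is annihilated, and that a simple explicit time step is adequate. Because the right-hand side is linear in φ, only φ = 0 and φ = 1 are compared. All names and numbers are hypothetical.

```python
def solve_stochastic(a1, a2, b1, b2, p, q, r, m1_0, m2_0, n_0, t_max, steps, fixed_phi=None):
    """Backward recursion for the expected payoff; returns (V at t=0, control at t=0).

    If fixed_phi is 0 or 1, that control is always used (policy evaluation);
    otherwise the better of phi = 0 and phi = 1 is chosen in every state.
    """
    dt = t_max / steps
    states = [(m1, m2, n) for m1 in range(m1_0 + 1)
              for m2 in range(m2_0 + 1) for n in range(n_0 + 1)]
    V = {(m1, m2, n): r * n - p * m1 - q * m2 for (m1, m2, n) in states}  # payoff at t_max
    phi = {}
    for _ in range(steps):                      # march backwards from t_max to 0
        new_V = {}
        for (m1, m2, n) in states:
            v = V[(m1, m2, n)]
            if n == 0 or (m1 == 0 and m2 == 0):
                new_V[(m1, m2, n)] = v          # battle already over: payoff is frozen
                continue
            y_loss = (b1 * m1 + b2 * m2) * (V[(m1, m2, n - 1)] - v)
            g1 = a1 * n * (V[(m1 - 1, m2, n)] - v) if m1 > 0 else 0.0  # fire on X1
            g2 = a2 * n * (V[(m1, m2 - 1, n)] - v) if m2 > 0 else 0.0  # fire on X2
            u = (1 if g1 >= g2 else 0) if fixed_phi is None else fixed_phi
            new_V[(m1, m2, n)] = v + dt * (y_loss + (g1 if u == 1 else g2))
            phi[(m1, m2, n)] = u
        V = new_V
    return V, phi

V_opt, phi_opt = solve_stochastic(0.3, 0.2, 0.1, 0.15, 1.0, 2.0, 1.0, 3, 3, 4, 5.0, 2000)
print(V_opt[(3, 3, 4)], phi_opt[(3, 3, 4)])
```

The table `phi_opt` depends on the current state, not only on the initial one, which is the sense in which the control is closed-loop; re-running with `fixed_phi=0` or `fixed_phi=1` gives the expected payoff of the two non-adaptive extremes for comparison.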
Table I.  Definition of Terminal States for Deterministic
Optimal Control Problem (Prescribed Duration
Battle).

S_1:  x_1(t_f) > 0,  x_2(t_f) > 0,  y(t_f) > 0,  t_f = t_max

S_2:  x_1(t_f) = x_1(t_1) = 0,  x_2(t_f) > 0,  y(t_f) > 0,  t_f = t_max,
      where t_1 < t_f

S_3:  x_1(t_f) = x_1(t_3) > 0,  x_2(t_f) = 0,  y(t_f) > 0,  t_f = t_max,
      where t_3 < t_f

S_4:  x_1(t_f) > 0,  x_2(t_f) > 0,  y(t_f) = 0,  t_f ≤ t_max

S_5:  x_1(t_f) = x_1(t_1) = 0,  x_2(t_f) > 0,  y(t_f) = 0,  t_f ≤ t_max,
      where t_1 < t_f

S_6:  x_1(t_f) = x_1(t_2) > 0,  x_2(t_f) = 0,  y(t_f) = 0,  t_f ≤ t_max,
      where t_2 < t_f

S_7:  x_1(t_f) = x_1(t_1) = 0,  x_2(t_f) = 0,  y(t_f) > 0,  t_f ≤ t_max,
      where t_1 < t_f

S_8:  x_1(t_f) = 0,  x_2(t_f) = x_2(t_4) = 0,  y(t_f) > 0,  t_f ≤ t_max,
      where t_4 < t_f

4.  Determination of an Optimal Policy for Deterministic Problem.

In this section we outline how an optimal policy (expressed as a
closed-loop control) may be determined for (7). In order to keep the
length of the paper at hand within reasonable limits, we highlight only
the main points: details which are available elsewhere in the open
literature are omitted, and the entire "solution" is not given here.

4.1.  Outline  of  Solution  Procedure. 

Before giving our solution algorithm, it seems appropriate to define
some terms. We have then

Definition 1:  By an extremal path we mean a path on which the necessary
               conditions of optimality are everywhere satisfied (we use
               the word everywhere, since we take the class of admissible
               controls to be the space of piecewise-continuous functions).

Definition 2:  By an extremal control we mean the control used in order
               that the system follow an extremal path.

Definition 3:  By the domain of controllability for extremals to a given
               terminal state we mean that subset of the initial state
               space from which extremals lead to the terminal state.

Definition 4:  By the synthesis of an extremal control we mean using the
               basic necessary conditions of optimality to explicitly
               determine the time history of an extremal control from
               initial to terminal time as a function of initial conditions.


Footnote on the omitted "solution": Complete results in a form suitable
for numerical determination are to be found in Appendix G of [43]. The
"solution" occupies twenty pages in [43], and this should explain why for
the purposes of the paper at hand only representative results are given.
Our solution algorithm then is as follows:

(a)  an extremal control law is developed from the maximum principle
     (which must be modified when the trajectory lies on the boundary of
     the state space); for Lanchester "square-law" attrition structures
     the extremal control law in many cases depends only on relationships
     between dual variables (marginal returns from destroying targets),

(b)  for each terminal state an extremal control is synthesized by
     combining a backwards integration of the adjoint system of differential
     equations with the extremal control law and corner conditions,

(c)  for each terminal state the domain of controllability for extremals
     is determined by forwards integration of the state equations using
     the synthesized extremal control from (b),

(d)  the solution is determined at this point for regions of the initial
     state space which are covered by only (part of) the domain of
     controllability for extremals to one terminal state; one must also verify
     that the entire initial state space has been accounted for, since
     otherwise one may have overlooked some type of "singular" surface,

(e)  if domains of controllability overlap so that for a point of the
     initial state space contained in their intersection there is more
     than one extremal leading to the terminal surface, then one computes
     the return (or payoff) associated with each extremal; the optimal
     trajectory is selected from the extremals by comparing these values.

The above solution algorithm is a refinement of the one presented
in [34]. Let us make a few remarks about the application of this procedure
to the prescribed duration battle. For this problem we may think of
time as being an additional state variable. On the other hand, for the
Isbell-Marlow terminal control problem [34] time may be considered as being
a parameter and consequently was eliminated for the determinations of step
(c) above. In other words, for the Isbell-Marlow problem a domain of
controllability was determined by inequalities involving the three state
variables; for the prescribed duration battle (7) such a determination
involves the four variables t_max, x_1^0, x_2^0, and y_0.

Footnote on the solution algorithm: For this approach to work it is
essential that an optimal policy exist for (7). This has previously been
established in [37], [41]. In this case one of the extremals must be an
optimal trajectory.

For the prescribed duration battle we have not been able in all
cases to develop analytic expressions at step (c) in the above algorithm
as we did for the terminal control problem studied in [34]. Consequently,
we could not analytically accomplish steps (d) and (e) for the problem at
hand. We have, however, used computational methods to determine the optimal
control. We have expressed our "solution" (partially presented in the next
section) so that given a point P^0 = (x_1^0, x_2^0, y_0) in the initial
state space and t_max, one can determine which terminal states are reached
by extremals. Thus, we can determine to which domains of controllability
P^0 belongs. Then, using the extremal control, we can numerically compute
the return (or payoff) associated with each extremal and select the optimal
policy from among a finite number of possibilities. A computer program was
written in FORTRAN to do the above, and computations were performed on an
IBM 360 computer.

4.2.  Summary of Solution.

We have applied the solution procedure of Section 4.1 to develop
a "solution" in the sense discussed there. Without loss of generality we
assume that a_1b_1 > a_2b_2, i.e. R > 1. There are two cases to be considered:

(1)  δ ≥ 1,

and (2)  0 ≤ δ < 1,

where δ = a_1 p/(a_2 q).

For Case (1): δ ≥ 1, the domains of controllability do not overlap
each other, and hence extremals are unique. The extremal control
is thus the optimal control. The optimal policy, moreover, may be expressed
in a particularly simple form: always concentrate all fire on X_1 while
x_1 > 0. Further details on domains of controllability and "event" times
are to be found in Table II of [43].

For Case (2): 0 ≤ δ < 1, some domains of controllability overlap
each other, and hence extremals are not unique (in the sense that from a
point in the initial state space the system may be steered along any one
of several extremals to various end states of battle). (See [41] for a
discussion of a similar case.) Thus, considerations "in the large" (i.e.
step (e) of the above solution procedure) are required to determine the
optimal policy. Unfortunately, explicit analytic expressions are not
readily obtainable as they were for the Isbell-Marlow terminal control
problem [34]. However, as discussed in Section 4.1 above, one can use the
information presented in Table III of [43] (which is fifteen pages long)
to numerically determine an optimal fire distribution policy for any specific
set of model input parameter values. A representative sample of this
information is given in Table II.

In Case (2) the optimal fire distribution policy cannot be expressed
in the very simple form as in the first case. When Y wins in time less
than t_max (S_7, for which the optimal policy is determined), the optimal
fire distribution policy is precisely the same as when δ ≥ 1. However, for
all other cases (i.e. terminal states S_1 through S_6) the extremal policy
is to finish the prescribed duration battle by firing at X_2, regardless of
whether or not X_1 has been annihilated. This differs from that when δ ≥ 1.
Thus, we see that force levels affect the optimal fire distribution policy.

Table II.  A Representative Part of the Solution
to the Prescribed Duration Battle for
0 ≤ δ < 1.

(Nonrestrictive assumption: R > 1, i.e. a_1b_1 > a_2b_2)

S_5:  x_1(t_f) = x_1(t_1) = 0,  x_2(t_f) > 0,  y(t_f) = 0,  t_f ≤ t_max

Extremal Control:  φ_D*(t) = 1 for 0 ≤ t ≤ t_1, where x_1(t_1) = 0;
                   φ_D*(t) = 0 for t_1 < t ≤ t_f

Domain of Controllability:

$$(b_1x_1^0 + b_2x_2^0)^2 - (b_2x_2^0)^2 \;<\; a_1b_1y_0^2 \;<\; (b_1x_1^0 + b_2x_2^0)^2 + (R-1)(b_2x_2^0)^2$$

[The remaining entries of this table (the expressions for t_f and for t_1
in three cases, and the dispersal-surface condition with its constant k)
are not legible in the source; the complete table is Table III of [43].]


4.3.  Development of Basic Necessary Conditions of Optimality.

We will use Speyer and Bryson's approach [32] of adjoining the
state variable constraint directly to the criterion functional with a
Lagrange multiplier. The Hamiltonian is given by (see also [19])

$$H(t,\mathbf{x},\mathbf{p},\phi_D) = -p_1\phi_D a_1 y - p_2(1-\phi_D)a_2 y - p_3(b_1x_1 + b_2x_2) + \eta_1(t)x_1 + \eta_2(t)x_2, \tag{9}$$

where

$$\eta_i(t) \begin{cases} = 0 & \text{for } x_i > 0,\\ \ge 0 & \text{for } x_i = 0. \end{cases}$$

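The adjoint system of differential equations for the dual variables is

$$\frac{dp_1}{dt} = -\frac{\partial H}{\partial x_1}(t,\mathbf{x},\mathbf{p},\phi_D) = b_1p_3 - \eta_1(t), \tag{10}$$

$$\frac{dp_2}{dt} = -\frac{\partial H}{\partial x_2}(t,\mathbf{x},\mathbf{p},\phi_D) = b_2p_3 - \eta_2(t), \tag{11}$$

$$\frac{dp_3}{dt} = -\frac{\partial H}{\partial y}(t,\mathbf{x},\mathbf{p},\phi_D) = \phi_D a_1p_1 + (1-\phi_D)a_2p_2. \tag{12}$$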


Boundary conditions for the dual variables (also frequently called
transversality conditions) are discussed below. When t_f < t_max, the
following transversality condition also holds:

$$H(t=t_f,\mathbf{x},\mathbf{p},\phi_D) = 0. \tag{13}$$


Footnote on Speyer and Bryson's approach [32]: Taylor apparently is the
only person to apply these important results to variational problems in
operations research. See [41] for discussion of previous applications.

When x_1, x_2 > 0, the maximum principle yields the extremal control
law [34], [41]

$$\phi_D^*(t) = \begin{cases} 1 & \text{for } v(t) > 0,\\ 0 & \text{for } v(t) < 0, \end{cases} \tag{14}$$

where v(t) = (-p_1)a_1 - (-p_2)a_2. In [34] we showed that there are no
singular subarcs (see Chapter 8 in [8]) in the solution.
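The way the control law (14) interacts with the adjoint equations (10) through (12) can be seen in a small numerical sketch. It is an illustration under stated assumptions, not the synthesis carried out in the paper: it assumes an interior terminal state with t_f = t_max (so that η_1 = η_2 = 0 and the terminal values are p_1 = -p, p_2 = -q, p_3 = r), it assumes δ < 1 so that the control is 0 at the end of the battle, and it integrates backwards with an Euler step until v(t) changes sign. The closed form `v_closed_form` was derived for this sketch under the same assumptions; all names and numbers are hypothetical.

```python
import math

def switching_time_to_go(a1, a2, b1, b2, p, q, r, t_max, steps=100000):
    """Time before t_f at which v(t) becomes positive on the backward sweep, or None."""
    dt = t_max / steps
    p1, p2, p3 = -p, -q, r                      # assumed terminal values of the dual variables
    for k in range(steps + 1):
        v = (-p1) * a1 - (-p2) * a2             # switching function of (14)
        if v > 0.0:
            return k * dt
        # v <= 0, so (14) gives phi = 0; right-hand sides of (10)-(12) with eta = 0
        dp1, dp2, dp3 = b1 * p3, b2 * p3, a2 * p2
        p1, p2, p3 = p1 - dt * dp1, p2 - dt * dp2, p3 - dt * dp3   # step backwards in time
    return None

def v_closed_form(tau, a1, a2, b1, b2, p, q, r):
    """v as a function of time-to-go tau while phi = 0 (derived for this sketch)."""
    w = math.sqrt(a2 * b2)
    R = (a1 * b1) / (a2 * b2)
    return (a1 * p - a2 * q) + (R - 1.0) * (w * r * math.sinh(w * tau)
                                            + a2 * q * (math.cosh(w * tau) - 1.0))

params = (0.2, 0.1, 0.1, 0.1, 1.0, 3.0, 1.0)    # R = 2, delta = 2/3 (illustrative values)
tau_s = switching_time_to_go(*params, t_max=10.0)
print(tau_s, v_closed_form(tau_s, *params))
```

Under these assumptions the extremal control fires at X_1 until the time-to-go falls to the returned value and fires at X_2 afterwards; whether such an extremal is admissible and optimal from a given initial point still depends on the domain-of-controllability and comparison steps of Section 4.1.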

Without loss of generality, let us consider a constrained subarc
on which x_1(t) = 0 for t_1 ≤ t ≤ t_f (and x_2, y > 0 for t < t_f). Since
dx_1/dt = 0, the control is clearly φ_D*(t) = 0 for t_1 ≤ t ≤ t_f. The
requirement that ∂H/∂φ_D = 0 yields the following relationship between dual
variables on the constrained subarc:

$$a_1p_1(t) = a_2p_2(t). \tag{15}$$


The multiplier η_1(t) is determined from the condition that
d/dt(∂H/∂φ_D) = 0, and this yields

$$\eta_1(t) = \frac{p_3(t)}{a_1}\,(a_1b_1 - a_2b_2). \tag{16}$$

The interpretation of η_1(t) (see [41] for a further discussion) is the
rate of marginal return to Y for keeping x_1 = 0. Thus, (intuitively)
Y tries to annihilate X_1 only when it profits him to do so. Furthermore,
the requirement that η_1(t) ≥ 0 when x_1 = 0 for a finite interval
of time yields that we must have

$$a_1b_1 \ge a_2b_2, \tag{17}$$

since it may be shown that p_3(t) > 0 for t < t_f. The nonrestrictive
assumption that a_1b_1 > a_2b_2 (i.e. R > 1) implies that it is nonoptimal
to have x_2 = 0 for a finite interval of time.
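The step from (15) to (16) can be written out explicitly. The following is a sketch that assumes x_2 > 0 on the constrained subarc, so that η_2(t) = 0, and uses the adjoint equations (10) and (11):

```latex
\begin{align*}
a_1 p_1(t) &= a_2 p_2(t) \quad \text{for } t_1 \le t \le t_f
  && \text{(15)}\\
a_1 \frac{dp_1}{dt} &= a_2 \frac{dp_2}{dt}
  && \text{(differentiate (15) along the subarc)}\\
a_1 \bigl( b_1 p_3 - \eta_1 \bigr) &= a_2 b_2 p_3
  && \text{(substitute (10) and (11) with } \eta_2 = 0\text{)}\\
\eta_1(t) &= \frac{p_3(t)}{a_1} \, (a_1 b_1 - a_2 b_2)
  && \text{(16)}
\end{align*}
```

Since p_3(t) > 0 for t < t_f and a_1 > 0, the sign of η_1(t) is the sign of a_1b_1 - a_2b_2, which is how the requirement η_1(t) ≥ 0 leads to (17).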

Furthermore, when the necessary conditions of optimality are expressed
in Speyer and Bryson's format [32] (see also [19]), the corner conditions
(see pp. 125-126 of [8]) take a particularly simple form for a first order
SVIC: the adjoint variables are continuous across all corners (both
interior to and on the boundary of the state space). In other words

$$\mathbf{p}(t_c^-) = \mathbf{p}(t_c^+), \tag{18}$$

where t_c^- denotes the time just before the corner (i.e. a left-hand limit).
We also have that

$$H(t_c^-,\mathbf{x}(t_c),\mathbf{p}(t_c),\phi_D(t_c^-)) = H(t_c^+,\mathbf{x}(t_c),\mathbf{p}(t_c),\phi_D(t_c^+)). \tag{19}$$

On entry to a constrained subarc with x_1(t) = 0 for t_1 ≤ t ≤ t_f, (19)
yields

$$a_1p_1(t_1^-) = a_1p_1(t_1^+) = a_2p_2(t_1^+) = a_2p_2(t_1^-). \tag{20}$$

Footnote on (15): The development of (15) requires a slightly different
argument when t = t_f and y(t_f) = 0. See [41] for a further discussion of
this point.

Let us finally consider the boundary conditions for the dual
variables at t = t_f. The nonrestrictive assumption that a_1b_1 > a_2b_2
yields that no extremals lead to S_8. The three terminal states S_1, S_2,
and S_3 may be discussed collectively. In all three cases the length of
the battle is equal to t_max. Then, according to the results presented
in [42], we have

for S_1, S_2, and S_3:

$$p_1(t_f) = -p + \nu_1, \qquad p_2(t_f) = -q + \nu_2, \qquad p_3(t_f) = r > 0, \tag{21}$$

where

$$\nu_i \begin{cases} = 0 & \text{for } x_i(t_f) > 0,\\ \ge 0 & \text{for } x_i(t_f) = 0 \text{ but } x_i(t) > 0 \text{ for } t < t_f,\\ \text{unrestricted} & \text{for } x_i(t_f) = 0 \text{ and } x_i(t) = 0 \text{ for } t_i \le t \le t_f \text{ with } t_i < t_f. \end{cases} \tag{22}$$

The latter condition that, for example, the multiplier ν_1 is unrestricted
when the system is on a constrained subarc for a finite interval of time
is because the boundary of the state space is "absorbing" (i.e. the state
constraint x_1 ≥ 0 essentially acts like a terminal equality constraint
as far as the determination of boundary conditions for the adjoint variables
is concerned [42]). If there were replacements in the model (7) so that the
boundary of the state space would not be "absorbing," then we would have
ν_1 ≥ 0 for x_1(t_f) = 0.

For $S_4$, $S_5$, and $S_6$ the duration of the battle $t_f$ is determined
by the terminal equality constraint $y(t_f) = 0$ when $t_f < t_{max}$ so that
the transversality condition (13) yields $p_3(t_f) = 0$. When $t_f = t_{max}$,
additional analysis is required, and this is discussed in Section 4.4
below. Then, again according to the results presented in [42], we have

for $S_4$, $S_5$, and $S_6$:

$$p_1(t_f) = -p + \nu_1, \quad p_2(t_f) = -q + \nu_2, \quad p_3(t_f) = 0, \tag{23}$$

where the multipliers $\nu_i$ for $i = 1,2$ are again given by (22).

For $S_7$: $x_1(t_f) = x_2(t_f) = 0$, $y(t_f) > 0$, $t_f \le t_{max}$, we have [8]

$$p_1(t_f) = -p + \nu_1, \quad p_2(t_f) = -q + \nu_2, \quad p_3(t_f) = r > 0, \tag{24}$$

since $t_f$ is determined by the (equality) terminal constraints $x_1(t_f) = 0$
and $x_2(t_f) = 0$. Since these are equality constraints, the multipliers
$\nu_1$ and $\nu_2$ are unrestricted in sign. Since $t_f$ is unspecified, the
transversality condition (13) with $\phi^*(t_f) = 0$ yields that $-p_2(t_f) a_2 y = 0$
so that $p_2(t_f) = 0$ and $\nu_2 = q$. The condition (15) which, in particular,
holds at $t = t_f$ yields that $p_1(t_f) = 0$. Thus, we have

for $S_7$ [$x_1(t_f) = 0$ before $x_2(t_f) = 0$, $y(t_f) > 0$]:

$$p_1(t_f) = 0, \quad p_2(t_f) = 0, \quad p_3(t_f) = r. \tag{25}$$

4.4. Synthesis of Extremal Control.

For each terminal state, extremals may be synthesized by combining
the conditions which must hold on a constrained subarc and the extremal
control law (14) with a backwards integration of the adjoint equations (10),
(11) and (12). The boundary conditions for the adjoint variables given
in Section 4.3 and the corner conditions (18) and (19) are used in this
backwards sweep process. It is convenient to use the switching function
$v(t) = (-p_1)a_1 - (-p_2)a_2$ in synthesizing extremals. Using (10) and (11),
we readily find that for $t < t_f$

$$\frac{dv}{dt} = p_3(t)(-a_1 b_1 + a_2 b_2) < 0, \tag{26}$$

since $p_3(t) > 0$ for $t < t_f$.

Details in the synthesis of extremals are similar to those presented
in [34]-[38], [41], and [43], and hence they are omitted. The treatment
in [37] is most similar to the problem at hand. Details for $\delta \ge 1$ and
for $0 \le \delta < 1$ are different.

There  are  two  interesting  aspects,  moreover,  that  we  encountered 
in  synthesizing  extremals.  These  are 


In some of these references the non-negativity of the force levels (i.e.
SVIC's) has been treated by means other than Speyer and Bryson's approach
[8]. The basic principles of working backwards from the end, however, are
the same in all applications.
(a) when $0 \le \delta < 1$ and a switch in the target type upon which all
Y-fire is concentrated occurs without the annihilation of a target type,
the switching time depends upon the initial force levels and possibly
the valuation of Y survivors, and

(b) when $P^0 = (x_1^0, x_2^0, y^0)$ is such that when $\delta < 1$ an extremal leads
to $S_4$ (i.e. we reach $S_4$ with a switch in tactics) with $t_f(S_4) < t_{max}$,
we can possibly also steer the system to an end point with
$y(t_f = t_{max}) = 0$ without violating any necessary conditions of optimality.

Let us first discuss the dependence of the non-annihilation switching
time on force levels and valuation of Y survivors. Such a switch in
fire distribution only happens for $\delta < 1$. Let us compare the situations
for extremals leading to $S_1$ and $S_4$. In both cases we have


$$\phi^*(t) = \begin{cases} 1 & \text{for } 0 \le t \le t_f - \tau_1, \\ 0 & \text{for } t_f - \tau_1 < t \le t_f, \end{cases} \tag{27}$$

where $x_i(t_s = t_f - \tau_1) > 0$. It is convenient to introduce the "backwards time"
$\tau$ defined by $\tau = t_f - t$. Then when $\delta < 1$, we have $\phi^*(\tau) = 0$ for
$0 \le \tau \le \tau_1$ where $\tau_1$ denotes the backwards time of the first switch in
fire distribution. For $S_4$ [$x_1(t_f) > 0$, $x_2(t_f) > 0$, $y(t_f) = 0$, $t_f < t_{max}$]
it may be shown using (10)-(12), (14), (23), and (26) that

$$\tau_1(S_4) = \frac{1}{\sqrt{a_2 b_2}} \cosh^{-1} z, \tag{28}$$

where $z = (R-\delta)/(R-1)$. For $S_1$ [$x_1(t_f) > 0$, $x_2(t_f) > 0$, $y(t_f) > 0$,
$t_f = t_{max}$] it may be shown that


Further details of the results summarized in this section are to be found
in [43]. To keep the paper at hand from being too long, we have omitted
them.


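$$\tau_1(S_1) = \frac{1}{\sqrt{a_2 b_2}} \ln\left(\frac{z + \sqrt{z^2 + \alpha^2 - 1}}{1 + \alpha}\right), \tag{29}$$

where

$$\alpha = \frac{r}{q}\sqrt{\frac{b_2}{a_2}}. \tag{30}$$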


The following theorem is of interest (see [36] for a similar result).

THEOREM 1: Assume that $R > 1$ and $\delta < 1$. Then,

$$\tau_1(S_1) < \tau_1(S_4).$$

A proof of Theorem 1 is given in [43]. Furthermore, it is readily shown
that $\lim_{r \to +\infty} \tau_1(S_1) = 0$. Thus, when $\delta < 1$, the switching time
$t = t_f - \tau_1(S_1)$ along extremals leading to $S_1$ explicitly depends on the value
Y places upon the survival of his own forces. The higher he values Y-force
survivors, the longer Y forces concentrate their fire on $X_1$ when
$\delta < 1$. For extremals leading to $S_4$, the transversality condition (13)
yields that Y-force survivors have zero value. Intuitively, we see that
firing longer at $X_1$ prolongs the length of battle for those cases when
$y(t_f) = 0$, since $a_1 b_1 > a_2 b_2$. However, for extremals leading to $S_4$
this is not an optimal tactic.
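The two switching times compared in Theorem 1 can be evaluated numerically. The following is a minimal sketch, not the authors' program: the function name `switch_times` is hypothetical, and it assumes the parameter groupings $R = a_1 b_1/(a_2 b_2)$, $\delta = a_1 p/(a_2 q)$ and $\alpha = (r/q)\sqrt{b_2/a_2}$, which are the definitions under which the closed forms (28) and (29) follow from a backwards integration of the adjoint equations with all fire on $X_2$.

```python
import math


def switch_times(a1, b1, a2, b2, p, q, r):
    """Return (tau_1(S_1), tau_1(S_4)), the backwards switching times.

    Assumed groupings (hypothetical names, see lead-in):
      R     = a1*b1 / (a2*b2)
      delta = a1*p / (a2*q)
      alpha = (r/q) * sqrt(b2/a2)
    """
    R = (a1 * b1) / (a2 * b2)
    delta = (a1 * p) / (a2 * q)
    if not (R > 1.0 and delta < 1.0):
        raise ValueError("closed forms assume R > 1 and delta < 1")

    omega = math.sqrt(a2 * b2)
    z = (R - delta) / (R - 1.0)          # z > 1 under the assumptions above
    alpha = (r / q) * math.sqrt(b2 / a2)

    # Extremal ending with y(t_f) = 0 and t_f < t_max (state S_4).
    tau_s4 = math.acosh(z) / omega
    # Extremal ending at t_f = t_max with Y survivors valued at r (state S_1).
    tau_s1 = math.log((z + math.sqrt(z * z + alpha * alpha - 1.0)) / (1.0 + alpha)) / omega
    return tau_s1, tau_s4
```

For example, `switch_times(1.0, 1.0, 0.5, 0.5, 1.0, 4.0, 1.0)` gives a shorter final phase of fire on $X_2$ for $S_1$ than for $S_4$, and increasing `r` shrinks the $S_1$ value toward zero, in line with the limit stated above.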

Let us therefore consider the case when $t_f = t_{max}$ for $S_4$. We
just discussed above the possibility when $R > 1 > \delta$ of prolonging the
length of battle along an extremal leading to $S_4$ by firing longer at
$X_1$. Using (27), it may be shown that

$$y(t_f) = y(t_f - \tau_1)\cosh\left(\sqrt{a_2 b_2}\,\tau_1\right) - \frac{b_1 x_1(t_f - \tau_1) + b_2 x_2(t_f - \tau_1)}{\sqrt{a_2 b_2}} \sinh\left(\sqrt{a_2 b_2}\,\tau_1\right), \tag{31}$$

where

$$\tau_1(\alpha) = \frac{1}{\sqrt{a_2 b_2}} \ln\left(\frac{z + \sqrt{z^2 + \alpha^2 - 1}}{1 + \alpha}\right), \tag{32}$$

and

$$\alpha = \frac{(r + \nu)}{q}\sqrt{\frac{b_2}{a_2}}, \tag{33}$$

where $\nu$ is the multiplier corresponding to the terminal constraint
$y(t_f) = 0$. Then, the following lemma may be established [43].

LEMMA 1: Consider an extremal leading to $S_4$ with $y(t_f)$
given by (31) and $t_f$ defined by $y(t_f) = 0$. Then
3y(tf)  1 

<0  if  and  only  if  a^b^y^  < ‘ 


In [43] it is shown that by increasing the implicit valuation of Y
survivors (i.e. $\nu$ in (33)) the length of battle may be extended until
$t_f = t_{max}$. However, this is not an optimal policy. This situation in
which a special case (here $t_f = t_{max}$ for $S_4$) requires an inordinate
amount of analysis unfortunately has arisen in all problems that we have
studied.

4.5.  Obtaining  an  Optimal  Policy. 

After extremals have been synthesized, domains of controllability
for extremals may be obtained as shown in [34]. It then remains to apply
steps (d) and (e) of the solution procedure given in Section 4.1. A
computer program written in FORTRAN has been developed to assist in the
determination of an optimal policy. This computer program does the
following: for a given point in the initial state space, we determine to which
terminal states extremals lead. Then, the payoff corresponding to each
extremal is computed. The optimal path (and hence the optimal policy) is
readily obtained by determining which extremal yields the largest return
to Y.

In the above fashion, the optimal fire distribution policy may be
obtained as an open-loop control. After this has been obtained, it is a
straightforward matter to express the optimal policy as a closed-loop
control. In doing this, it is convenient to cite the principle of optimality
[1] (a special case of Isaacs' tenet of transition [17] (see also [2])),
i.e. every subarc of an optimal trajectory is itself an optimal trajectory.


5.  Determination  of  an  Optimal  Policy  for  Stochastic  Problem. 

In  this  section  we  discuss  how  an  optimal  fire  distribution  policy 
(expressed  as  a closed-loop  control)  may  be  determined  for  (8) . Using 
the  formalism  of  dynamic  programming,  we  develop  the  fundamental  functional 
equation  for  the  optimal  expected  value  function.  This  is  a sufficient 
condition  of  optimality:  a control  which  leads  to  the  satisfying  of  this 

equation  is  an  optimal  policy  (see  [29]).  An  analytic  solution  is  developed 
to  the  fundamental  functional  equation  for  very  small  numbers  of  combatants. 
Finite  difference  methods  are  used,  however,  to  generate  a numerical 
approximate  solution,  since  a general  solution  (for  arbitrary  numbers  of 
combatants)  has  not  been  obtained  to  the  fundamental  functional  equation. 

5.1. Development of Fundamental Functional Equation.

Let $S(\tau, m_1, m_2, n)$ denote the optimal expected-value function (see
[12]). Then

$$S(\tau, m_1, m_2, n) = \max_{\phi_s \in \Phi} E[\,rN(\tau=0) - pM_1(\tau=0) - qM_2(\tau=0)\,], \tag{34}$$

where

the system state is $m_1, m_2, n$ at time $\tau$ (i.e. $M_1(\tau) = m_1$, etc.),

$\Phi$ is the class of admissible controls (i.e. $\phi_s$ must always be
chosen from the set of rational numbers $\{0, 1/n(\tau), 2/n(\tau), \ldots, 1\}$),

$\tau = t_f - t$ is the "backwards time" from the end of battle (which
begins at $t = 0$),

$E$ denotes mathematical expectation given that $\mathbf{m}(\tau) = (m_1(\tau), m_2(\tau), n(\tau))$,

casualties occur in a random fashion between $t$ and $t_f$.

In other words, $S(\tau, m_1, m_2, n)$ is the maximum return that we get on the
average when we start with force levels $m_1$, $m_2$, and $n$ at $t = t_f - \tau$,
follow an optimal policy $\phi_s^*(s, m_1, m_2, n)$ (chosen from the class of
admissible policies $\Phi$) for $t \le s \le t_f$, and casualties occur in a
random fashion.

We consider that casualties occur as a Markov process with discrete
state space (or discontinuous Markov process). Specifically, we assume
that

(1) the attrition process is a continuous parameter Markov chain with
stationary transition probabilities corresponding to a deterministic
Lanchester square-law attrition process; this is equivalent to
assuming

    (a) the future occurrences of casualties depend only on the state
    of the system at $t$ and not on past history,

    (b) the transition probabilities depend on only the state of the
    system,

    (c) Prob[one $X_1$ casualty in interval $\Delta t$] $= \phi a_1 n \Delta t$,
        Prob[one $X_2$ casualty in interval $\Delta t$] $= (1-\phi) a_2 n \Delta t$,
        Prob[one $Y$ casualty in interval $\Delta t$] $= (b_1 m_1 + b_2 m_2) \Delta t$,
    where $\phi a_1 n$ is the $X_1$ casualty rate, etc.,

    (d) Prob[more than one casualty in interval $\Delta t$] $= O((\Delta t)^2)$,
    where $O(x)$ denotes dependence on $x$ such that
    $\lim_{x \to 0} O(x)/x = \text{const.}$,

(2) the Y-forces have perfect information as to the state of the system
at $t$ and the expected casualty rates,

(3) the Y-forces can instantaneously shift fire from any target at any
time,

(4) the length of the battle is known.
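The assumptions above describe a continuous-time Markov chain whose transition rates are the casualty rates in (c). A minimal sketch of one realization of such a process under a given fire-distribution rule is shown below. It is an illustration only: the function name `simulate_battle`, the `policy` callback and the use of exponential waiting times between casualties are assumptions introduced here, not part of the paper's FORTRAN programs.

```python
import random


def simulate_battle(m1, m2, n, a1, a2, b1, b2, t_max, policy, rng):
    """Simulate one sample path of the attrition process up to time t_max.

    policy(t, m1, m2, n) returns the fraction phi of Y fire aimed at X_1.
    rng is a random.Random instance. Returns the final (m1, m2, n).
    """
    t = 0.0
    while t < t_max and n > 0 and (m1 > 0 or m2 > 0):
        phi = policy(t, m1, m2, n)
        # Fire cannot be directed at an annihilated target type.
        if m1 == 0:
            phi = 0.0
        if m2 == 0:
            phi = 1.0
        rate_x1 = phi * a1 * n            # X_1 casualty rate
        rate_x2 = (1.0 - phi) * a2 * n    # X_2 casualty rate
        rate_y = b1 * m1 + b2 * m2        # Y casualty rate
        total = rate_x1 + rate_x2 + rate_y
        # Waiting time to the next casualty is exponential with rate `total`.
        t += rng.expovariate(total)
        if t >= t_max:
            break
        u = rng.random() * total
        if u < rate_x1:
            m1 -= 1
        elif u < rate_x1 + rate_x2:
            m2 -= 1
        else:
            n -= 1
    return m1, m2, n
```

A call such as `simulate_battle(5, 5, 5, 0.4, 0.2, 0.1, 0.1, 3.0, lambda t, m1, m2, n: 1.0, random.Random(1))` returns the surviving force levels for one realization in which Y always fires at $X_1$ while it exists.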

Then, we have

state variables: $M_1(t)$, $M_2(t)$, $N(t)$,

decision (or control) variable: $\phi_s$,

where

$$\phi_s \in \Phi = \left\{0, \frac{1}{n(t)}, \frac{2}{n(t)}, \ldots, 1\right\}.$$

To be more precise $\phi_s = \phi_s(t, m_1, m_2, n)$ is a closed-loop (or feedback)
control.

To develop the fundamental functional equation for the optimal
expected-value function, we begin by considering any interval of "backwards
time" of length $\Delta\tau$ which occurs from $\tau - \Delta\tau$ to $\tau$. There are five
exhaustive and mutually exclusive possibilities for random events to occur
in such an interval. These are

(1) one $X_1$ casualty occurs,

(2) one $X_2$ casualty occurs,

(3) one $Y$ casualty occurs,

(4) no casualty occurs,

(5) more than one casualty occurs.

Let us now examine each of these cases and develop expected returns.

(1) One $X_1$ casualty occurs in $\Delta\tau$:

By our assumptions above, we have for the probability of occurrence
of this event

Prob[one $X_1$ casualty occurs in $\Delta\tau$] $= \phi_s a_1 n \Delta\tau$.

Given that one $X_1$ casualty is realized in the interval from $\tau$ to
$\tau - \Delta\tau$, the optimal fire distribution policy for Y will consider the
maximum expected value for the return functional as casualties continue
to occur randomly from $\tau - \Delta\tau$ to $\tau = 0$. This maximum expected value
is $S(\tau - \Delta\tau, m_1(\tau - \Delta\tau), m_2(\tau - \Delta\tau), n(\tau - \Delta\tau))$ where $m_1(\tau - \Delta\tau) = m_1(\tau) - 1$,
$m_2(\tau - \Delta\tau) = m_2(\tau)$, and $n(\tau - \Delta\tau) = n(\tau)$.

(2) One $X_2$ casualty occurs in $\Delta\tau$:

Similarly, we have that

Prob[one $X_2$ casualty occurs in $\Delta\tau$] $= (1 - \phi_s) a_2 n \Delta\tau$,

with the optimal expected-value function $S(\tau - \Delta\tau, m_1(\tau), m_2(\tau) - 1, n(\tau))$.

Events (3) through (5) are analyzed in a similar fashion.

Now, by the standard dynamic programming argument which combines
the probabilities of events (1) through (5) above with the maximum expected
return achievable given that these events occur, we obtain the expression

$$\begin{aligned} S(\tau, m_1, m_2, n) = \max_{\phi_s \in \Phi} \{ & [1 - \Delta\tau\{\phi_s a_1 n + (1-\phi_s) a_2 n + b_1 m_1 + b_2 m_2\}]\, S(\tau - \Delta\tau, m_1, m_2, n) \\ & + \phi_s a_1 n \Delta\tau\, S(\tau - \Delta\tau, m_1 - 1, m_2, n) + (1-\phi_s) a_2 n \Delta\tau\, S(\tau - \Delta\tau, m_1, m_2 - 1, n) \\ & + (b_1 m_1 + b_2 m_2) \Delta\tau\, S(\tau - \Delta\tau, m_1, m_2, n - 1) \}. \end{aligned} \tag{35}$$

Rearranging terms in (35) and taking the limit as $\Delta\tau \to 0$, we
obtain the fundamental functional equation for the optimal expected-value
function

for $m_1, m_2, n > 0$:

$$\begin{aligned} \frac{\partial S}{\partial \tau}(\tau, m_1, m_2, n) = {} & (b_1 m_1 + b_2 m_2)\{S(\tau, m_1, m_2, n-1) - S(\tau, m_1, m_2, n)\} \\ & + n \max_{0 \le \phi_s \le 1} [\phi_s a_1 \{S(\tau, m_1 - 1, m_2, n) - S(\tau, m_1, m_2, n)\} \\ & + (1 - \phi_s) a_2 \{S(\tau, m_1, m_2 - 1, n) - S(\tau, m_1, m_2, n)\}], \end{aligned} \tag{36}$$

with the boundary condition at $t = t_f$

$$S(\tau = 0, m_1, m_2, n) = rn - pm_1 - qm_2, \tag{37}$$

where $m_1$, $m_2$, and $n$ are integers and

$$\phi_s \in \left\{0, \frac{1}{n}, \frac{2}{n}, \ldots, \frac{n-1}{n}, 1\right\}. \tag{38}$$

Special forms of (36) in which $m_1 = 0$, etc., will be given later.

More concisely, we could have said that (36) results from combination
of the well-known formalism of dynamic programming with the retrospective
(backward) probabilistic evolution of the system over time (c.f. [13], [22]).
It should be noted that (36) is a special case of an equation given by
Kushner in 1962 [21].

If we take (36) to be the basic equation for $S(\tau, m_1, m_2, n)$, then
(35) may be considered to be the simplest finite difference approximation
to it, i.e. the result of applying the well-known Euler's method to (36)
(see pp. 130-131 of [15]). (Of course, a method employing a higher order
approximation scheme (see pp. 132-140 of [15]) may be necessary under many
circumstances.) We will find this point of view convenient when we consider
developing a solution to (36).

Alternatively,  we  could  have  taken  a discrete  parameter  Markov 
chain  as  our  basic  combat  model.  It  is  readily  shown  that  an  optimal 
policy  exists  for  this  latter  formulation  (see  Theorem  1 on  pp.  88-89  of 
[22]),  and  that  a policy  which  yields  the  maximum  in  (35)  is  an  optimal 
policy  (see  Theorem  2 on  p.  89  of  [22]). 

5.2.  On  the  Analytic  Solution  of  the  Fundamental  Functional  Equation. 

The first task in determining an optimal fire distribution policy
(which requires obtaining the solution to (36) and (37)) is to develop the
entire system of equations (c.f. equations (2) through (4)). We must,
therefore, develop the form that (36) takes at the boundary of the system,

for $m_1 > 0$, $m_2 = 0$, $n > 0$:

$$\frac{\partial S}{\partial \tau}(\tau, m_1, 0, n) = b_1 m_1 \{S(\tau, m_1, 0, n-1) - S(\tau, m_1, 0, n)\} + a_1 n \{S(\tau, m_1 - 1, 0, n) - S(\tau, m_1, 0, n)\}. \tag{42}$$

Equations (36) through (42) are the complete system of equations for the
optimal expected-value function in the optimal control of the Lanchester
stochastic process.

For $m_1 > 0$, $m_2 > 0$, $n > 0$ the optimal fire distribution
policy is determined by the maximization operation in (34), and hence

$$\phi_s^*(\tau, m_1, m_2, n) = \begin{cases} 1 & \text{for } W(\tau, m_1, m_2, n) > 0, \\ 0 & \text{for } W(\tau, m_1, m_2, n) < 0, \end{cases} \tag{43}$$

where we shall refer to $W(\tau, m_1, m_2, n)$ as the "switching function." It is
defined by

for $m_1 > 0$, $m_2 > 0$, $n > 0$:

$$W(\tau, m_1, m_2, n) = a_1\{S(\tau, m_1 - 1, m_2, n) - S(\tau, m_1, m_2, n)\} - a_2\{S(\tau, m_1, m_2 - 1, n) - S(\tau, m_1, m_2, n)\}. \tag{44}$$

Let us observe that at the end of the battle at $t = t_f$, we may combine
(37), (43), and (44) to obtain

$$\phi_s^*(\tau = 0, m_1, m_2, n) = \begin{cases} 1 & \text{for } a_1 p > a_2 q, \\ 0 & \text{for } a_1 p < a_2 q, \end{cases} \tag{45}$$

which is similar to results for the optimal control of the deterministic
process (7) (see, for example, (14), (21), and (22)).

It should be noted that equations (36) through (42) have the same
form as those for the Lanchester square-law attrition stochastic process
(i.e. equations (2) through (4) when the attrition rates are given by (b)).
A general solution has not been obtained to these equations. Nevertheless,
it is of value to develop a partial solution. For example, since we use
finite difference methods to generate an approximate solution (see Section
5.3 below), it is desirable to check the adequacy of the approximation (in
particular, the "time step size" used in the numerical propagation of the
approximate solution by "marching ahead in time"). This is easily done by
comparing the approximate solution, denoted as $\hat{S}$, to the exact analytic
solution, denoted as $S$. Hence, a partial analytic solution is useful.

Careful consideration of (36) through (42) reveals that there are
restrictions on the order in which the optimal expected-value functions
$S(\tau, m_1, m_2, n)$ for $m_1 = 0, 1, 2, \ldots$, etc., can be computed. In particular
an admissible sequence for building up the solution through $S(\tau, 1, 1, 1)$
is shown below in Table III.

    m_1   m_2   n

     0     0    0
     1     0    0
     0     1    0
     0     0    1
     1     1    0
     0     1    1
     1     0    1
     1     1    1

Table III.

Admissible Order for Computing Optimal Expected-Value
Functions (admissible order is from top to bottom).

We note that (36) becomes a first order system of ordinary differential
equations for $S(\tau, m_1, m_2, n)$ when $\phi_s^*$ as determined by (43) is used. Solving
for $S(\tau, m_1, m_2, n)$ for $m_1 = 0, 1, 2, \ldots$, etc., we can then determine $\phi_s^*$ by
(43). The synthesis of an optimal control by combination of the control law
(43) with integration of a system of differential equations is similar to
that for deterministic optimal control problems.

We readily successively compute using (39) through (42)

$$\begin{aligned} & S(\tau, 0, 0, 0) = 0, \quad S(\tau, 1, 0, 0) = -p, \quad S(\tau, 0, 1, 0) = -q, \\ & S(\tau, 0, 0, 1) = r, \quad S(\tau, 1, 1, 0) = -p - q, \\ & S(\tau, 0, 1, 1) = \left[\frac{b_2 r - a_2 q}{a_2 + b_2}\right] e^{-(a_2 + b_2)\tau} + \left[\frac{a_2 r - b_2 q}{a_2 + b_2}\right], \\ & S(\tau, 1, 0, 1) = \left[\frac{b_1 r - a_1 p}{a_1 + b_1}\right] e^{-(a_1 + b_1)\tau} + \left[\frac{a_1 r - b_1 p}{a_1 + b_1}\right]. \end{aligned} \tag{46}$$

Using (46), equations (36) and (37) become for $m_1 = 1$, $m_2 = 1$, $n = 1$,

$$\begin{aligned} \frac{\partial S}{\partial \tau}(\tau, 1, 1, 1) = {} & -(b_1 + b_2)\{S(\tau, 1, 1, 1) + (p + q)\} + \max_{\phi_s = 0 \text{ or } 1} [\phi_s a_1 \{S(\tau, 0, 1, 1) - S(\tau, 1, 1, 1)\} \\ & + (1 - \phi_s) a_2 \{S(\tau, 1, 0, 1) - S(\tau, 1, 1, 1)\}], \\ S(\tau = 0, 1, 1, 1) = {} & r - p - q, \end{aligned} \tag{47}$$

where $S(\tau, 0, 1, 1)$ and $S(\tau, 1, 0, 1)$ are given by (46).


Using (43), (44), and (45), we may readily solve (47). As for the
deterministic formulation, there are two cases that must be distinguished:

Case (1) $a_1 p \ge a_2 q$,

Case (2) $a_1 p < a_2 q$.

For Case (1): $a_1 p \ge a_2 q$, we have that $\phi_s^*(\tau, 1, 1, 1) = 1$ for $0 \le \tau \le \tau_1$,
where $\tau_1$ denotes the "backwards time" of the first switch in the optimal
fire distribution policy. Thus $\tau_1$ is the smallest $\tau$ which satisfies
$W(\tau = \tau_1, 1, 1, 1) = 0$ with $W(\tau, 1, 1, 1)$ given by (44). Then, we have

for $0 \le \tau \le \tau_1$ when $a_1 p \ge a_2 q$ ($\phi_s^*(\tau, 1, 1, 1) = 1$):

$$\begin{aligned} S(\tau, 1, 1, 1) = {} & \frac{a_1 (b_2 r - a_2 q)}{(a_1 + b_1 - a_2)(a_2 + b_2)} e^{-(a_2 + b_2)\tau} \\ & + \left\{ \frac{[(b_1 - a_2)(b_1 + b_2) + a_1 b_1]\, r}{(a_1 + b_1 - a_2)(a_1 + b_1 + b_2)} - \frac{a_1 p}{(a_1 + b_1 + b_2)} + \frac{a_1 a_2 q}{(a_1 + b_1 - a_2)(a_1 + b_1 + b_2)} \right\} e^{-(a_1 + b_1 + b_2)\tau} \\ & + \left\{ \frac{a_1 a_2 r}{(a_2 + b_2)(a_1 + b_1 + b_2)} - \frac{(b_1 + b_2) p}{(a_1 + b_1 + b_2)} - \frac{[(b_1 + b_2)(a_2 + b_2) + a_1 b_2]\, q}{(a_2 + b_2)(a_1 + b_1 + b_2)} \right\}. \end{aligned} \tag{48}$$


We note that $\tau_1$ might be equal to $+\infty$, i.e. we never switch. Assuming
that a switch in targets does occur, however, let us denote $S(\tau = \tau_1, 1, 1, 1)$
by $S_0$ where, as we recall, $\tau_1$ is the smallest $\tau$ which satisfies
$W(\tau = \tau_1, 1, 1, 1) = 0$. Then, we have that $\phi_s^*(\tau, 1, 1, 1) = 0$ for $\tau_1 < \tau \le \tau_2$,
where $\tau_2$ denotes the "backwards time" of the second switch in the optimal
fire distribution policy. Then, we have

for $\tau_1 < \tau \le \tau_2$ when $a_1 p \ge a_2 q$ ($\phi_s^*(\tau, 1, 1, 1) = 0$):

$$\begin{aligned} S(\tau, 1, 1, 1) = {} & \frac{a_2 (b_1 r - a_1 p)}{(a_2 + b_2 - a_1)(a_1 + b_1)} \left\{ e^{-(a_1 + b_1)\tau} - e^{(a_2 + b_2)(\tau_1 - \tau) - a_1 \tau_1 - b_1 \tau} \right\} \\ & + \left\{ S_0 - \frac{a_1 a_2 r}{(a_1 + b_1)(a_2 + b_2 + b_1)} + \frac{[(b_1 + b_2)(a_1 + b_1) + a_2 b_1]\, p}{(a_1 + b_1)(a_2 + b_2 + b_1)} + \frac{(b_1 + b_2) q}{(a_2 + b_2 + b_1)} \right\} e^{(a_2 + b_2 + b_1)(\tau_1 - \tau)} \\ & + \left\{ \frac{a_1 a_2 r}{(a_1 + b_1)(a_2 + b_2 + b_1)} - \frac{[(b_1 + b_2)(a_1 + b_1) + a_2 b_1]\, p}{(a_1 + b_1)(a_2 + b_2 + b_1)} - \frac{(b_1 + b_2) q}{(a_2 + b_2 + b_1)} \right\}. \end{aligned} \tag{49}$$

Again, we note that $\tau_2$ might be equal to $+\infty$, i.e. we might never
redistribute fire a second time. Assuming that a second switch in fire
distribution does occur, we have $\phi_s^*(\tau, 1, 1, 1) = 1$ for $\tau_2 < \tau \le \tau_3$. We have not
carried out the computation of $S(\tau, 1, 1, 1)$ past $\tau_2$.

For Case (2): $a_1 p < a_2 q$, the results are symmetric to the above
(interchange the roles of $X_1$ and $X_2$) and hence are omitted.

Although the above constitutes a complete development for $S(\tau, 1, 1, 1)$
(and hence $\phi_s^*(\tau, 1, 1, 1)$ via $W(\tau, 1, 1, 1)$), these results are complex
enough that it is not immediately clear how $\phi_s^*(\tau, 1, 1, 1)$ changes over
time and/or depends on model parameters.

5.3. Development of Numerical Solution.

With the advent of modern high-speed digital computers, finite
difference methods of obtaining an approximate solution are commonly used
when an analytic solution cannot be obtained to equations like (36) through
(42). Euler's method (see pp. 130-131 of [15]) yields the simplest finite
difference approximation for (36). Let $\hat{S}$ denote the approximation to the
optimal expected-value function $S$. We shall compute values for this
approximation at discrete points in time separated by a constant amount
$\Delta\tau$. We let $\tau = \ell\Delta\tau$ so that $t_f = L\Delta\tau$. Then (36) may be approximated

for $m_1 > 0$, $m_2 > 0$, $n > 0$:

$$\begin{aligned} \hat{S}((\ell+1)\Delta\tau, m_1, m_2, n) = {} & \{1 - (\Delta\tau)(b_1 m_1 + b_2 m_2)\}\, \hat{S}(\ell\Delta\tau, m_1, m_2, n) \\ & + (\Delta\tau)(b_1 m_1 + b_2 m_2)\, \hat{S}(\ell\Delta\tau, m_1, m_2, n-1) \\ & + n(\Delta\tau) \max_{\phi_s \in \Phi} [\phi_s a_1 \{\hat{S}(\ell\Delta\tau, m_1 - 1, m_2, n) - \hat{S}(\ell\Delta\tau, m_1, m_2, n)\} \\ & + (1 - \phi_s) a_2 \{\hat{S}(\ell\Delta\tau, m_1, m_2 - 1, n) - \hat{S}(\ell\Delta\tau, m_1, m_2, n)\}], \end{aligned} \tag{50}$$

for $\ell = 0, 1, \ldots, L-1$ with boundary condition (37) and also (38). Similar
approximations may be developed for (41) and (42).
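The marching scheme (50) can be sketched in a few lines. The code below is an illustrative reimplementation, not the authors' FORTRAN program: the function name `solve_euler` and its arguments are hypothetical. It assumes that states with $n = 0$ or with $m_1 = m_2 = 0$ are absorbing (their value stays at the terminal payoff (37)), and that when only one X target type survives all fire goes to it, which matches (42) and the values in (46). Because the bracket in (50) is linear in $\phi_s$, the maximum is taken over $\phi_s \in \{0, 1\}$ only.

```python
def solve_euler(M1, M2, N, a1, a2, b1, b2, p, q, r, t_f, steps):
    """March the value function from tau = 0 to tau = t_f with Euler steps.

    Returns (S, phi): S maps (m1, m2, n) to the approximate value at
    tau = t_f, phi maps each non-absorbing state to the maximizing control
    (0 or 1) used in the last step.
    """
    dt = t_f / steps
    # Step-size restriction: the total transition probability per step must stay below one.
    if dt * (max(a1, a2) * N + b1 * M1 + b2 * M2) >= 1.0:
        raise ValueError("time step too large for a probabilistic interpretation")

    # Boundary condition (37) at tau = 0.
    S = {(m1, m2, n): r * n - p * m1 - q * m2
         for m1 in range(M1 + 1) for m2 in range(M2 + 1) for n in range(N + 1)}
    phi = {}
    for _ in range(steps):
        new = {}
        for (m1, m2, n), s in S.items():
            if n == 0 or (m1 == 0 and m2 == 0):
                new[(m1, m2, n)] = s          # absorbing state (assumption)
                continue
            gain1 = a1 * (S[(m1 - 1, m2, n)] - s) if m1 > 0 else None
            gain2 = a2 * (S[(m1, m2 - 1, n)] - s) if m2 > 0 else None
            if gain1 is None:
                best, choice = gain2, 0
            elif gain2 is None or gain1 >= gain2:
                best, choice = gain1, 1
            else:
                best, choice = gain2, 0
            y_loss = (b1 * m1 + b2 * m2) * (S[(m1, m2, n - 1)] - s)
            new[(m1, m2, n)] = s + dt * (y_loss + n * best)
            phi[(m1, m2, n)] = choice
        S = new
    return S, phi
```

With `M1 = M2 = N = 1` the value returned for state `(1, 0, 1)` can be compared against the closed form for $S(\tau, 1, 0, 1)$ in (46), which is the kind of step-size check described in Section 5.2. Only the current and the new time level are held in memory, as in the description of the authors' program.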


We recall that for the deterministic formulation when $x_1(t_f) > 0$ and
$x_2(t_f) > 0$, the conditions $a_1 p \ge a_2 q$ and $a_1 b_1 > a_2 b_2$ implied that
$\phi^*(t, x_1, x_2, y) = 1$ for the entire battle.

As noted above, consideration of (36) through (42) yields that
there are restrictions on the order in which the optimal expected-value
function $S$ (or its approximation $\hat{S}$) is computed. The computation of
$\hat{S}((\ell+1)\Delta\tau, m_1, m_2, n)$ depends upon the quantities shown in Figure 1 below.

Figure 1. Dependence of Optimal Expected-Value Function
on Discrete State Variables.

Based on the dependence depicted in Figure 1, the solution can be
"built-up" as shown in Table IV.

It remains to discuss the adequacy of the finite difference
approximation (50). It is well-known (see pp. 130-145 in [15]) that Euler's
method yields a finite difference approximation for such a system of
differential equations that is both consistent and stable so that the
approximate solution $\hat{S}$ can be guaranteed to converge to the exact analytic
solution $S$ as $\Delta\tau \to 0$ (and $L \to \infty$) [28]. However, $\Delta\tau$ must not be too
large in order to keep the truncation error satisfactorily small. Moreover,
the time step size $\Delta\tau$ is also limited by the fact that quantities like
$(\Delta\tau)(b_1 m_1 + b_2 m_2)$ or $a_1 n \Delta\tau$ or $a_2 n \Delta\tau$ in (50) represent probabilities and
hence must be less than one. In our computational work we have used a
time step size which yields agreement in the fourth decimal place to the
right of the decimal point when $\hat{S}$ is compared to the exact analytic
solution $S$ in the special cases (such as (48) and (49)) when the latter has
been obtained.

A computer program has been written in FORTRAN for this purpose.

Table IV. Admissible Order for Computing Optimal
Expected-Value Functions.

[Entries for the columns $m_1$, $m_2$, $n$ are not legible in this copy.]

Note: Admissible order is top to bottom, starting with
column (composed of $m_1$, $m_2$, $n$) on left.


6. Comparison of Results from Deterministic and Stochastic Formulations.

In this section we compare the structures of the optimal fire
distribution policy between the deterministic control problem (7) and the
stochastic control problem (8). Before presenting this comparison, it
seems appropriate to discuss some general methodological considerations.

Any comparison between the two models should be guided by the purpose
of the comparison. In the paper at hand our purpose is to consider whether
the structure of the optimal fire distribution policy is the same for the
two formulations. In other words, we would like to determine upon what
groups of model parameters the optimal allocation rule depends and whether
this depends upon the particular form of model adopted (here deterministic
or stochastic). The things that can be compared between the two models
are (1) the optimal fire distribution policy and (2) the optimal (expected)
return. It is the opinion of the authors that the second criterion (i.e.
optimal return) is only significant when there are differences between the
optimal policies from the two models. Furthermore, there are two types of
comparisons that we can make between the models: one is quantitative and
the other is qualitative.

A direct quantitative comparison of the optimal policies obtained
from the two formulations is impossible: on the one hand for the deterministic
model we have a piecewise differentiable battle trajectory, while on the
other hand for the stochastic model we have a discontinuous Markov process
describing the force levels. Thus, we have $\phi_d^*(t, x_1, x_2, y)$ for the
deterministic formulation with $x_1$, $x_2$, and $y$ varying continuously over time,
and we have $\phi_s^*(\tau, m_1, m_2, n)$ for the stochastic formulation with $m_1$, $m_2$,
and $n$ restricted to be non-negative integers and casualties occurring
randomly as a Markov jump process. The impossibility of directly comparing
$\phi_d^*(t, x_1, x_2, y)$ and $\phi_s^*(\tau, m_1, m_2, n)$ continuously over time should be apparent.

The only papers known to the authors in which a quantitative comparison
between results for deterministic and stochastic optimal control problems
is made are [48] and [49]. In both papers the state space is continuous in
the stochastic problem.
Nevertheless,  we  can  still  qualitatively  compare  the  structures  of 

k 

the  two  policies.  There  is,  however,  a difficulty  in  that  (t  ,n) 

represents  a conditional  policy,  i.e.  the  optimal  policy  given  that  the 
system  is  in  state  (m^.ra^.n)  with  "backwards  time"  t remaining  in  the 
battle.  When  a state  transition  occurs  (randomly)  to  , then 

k 

the  optimal  policy  accordingly  becomes  (t ,m' ,m' ,n  ).  In  comparing  the 

b i L 

optimal  policies  this  should  be  taken  into  account,  since  it  does  not  seem 
appropriate  to  compare  cj>g  (t  ,m°  ,m°  .n^)  with  m° , m^,  and  n^  held  con- 

k 

stant  to  4>d(t ,x^ ,x2 ,y)  with  x^,  x2>  and  y changing  (continuously) 

over  time.  Since  for  the  stochastic  formulation  it  does  not  make  sense 
to  consider  an  "average"  optimal  policy  or  the  optimal  policy  for  "average" 
force  levels,  for  comparison  with  the  optimal  policy  for  the  deterministic 
formulation  we  have  considered  a realization  of  the  stochastic  attrition 
process  in  which  the  force  levels  are  always  "near  to"  those  of  the  corre- 

k 

sponding  deterministic  process.  In  other  words,  we  will  compare  (f^(t  ,x^  ,x.,  ,y) 

* 

to  4*5  (T  ,m2  ,n)  at  selected  values  of  x^  , x^,  and  y.  The  force  levels 

in  the  deterministic  model  are  rounded  to  integers  to  yield  the  values  ol 
m^ , m2 , and  n as  follows:  * ( x ^ ] + 1 (and  m^  = 0 when  x^  = 0) 


w 
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where  [x]  denotes  "the  greatest  integer  in  x,"  i.e.  [3.96]  = iJ1 
Moreover, in our comparison we will try to use the results obtained from
the deterministic formulation to gain insight into the behavior of the
optimal policy for the stochastic control problem. In other words, we
will try to explain results from the stochastic formulation by considering
the corresponding behavior for the deterministic formulation.
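
The rounding rule just described can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and applying the same rule to $x_2$ and $y$ as to $x_1$ is an assumption, since the rule is written out above for $m_1$ only. The sketch also applies the rule literally at exact integer values of $x$, a case the text does not discuss.

```python
import math


def to_integer_level(x):
    """Return [x] + 1 for x > 0 and 0 for x == 0, where [x] is the greatest integer in x."""
    if x < 0:
        raise ValueError("force levels must be non-negative")
    if x == 0:
        return 0
    return math.floor(x) + 1


def to_state(x1, x2, y):
    """Round deterministic force levels (x1, x2, y) to an integer state (m1, m2, n).

    Assumption: the rule stated for m1 is applied unchanged to m2 and n.
    """
    return to_integer_level(x1), to_integer_level(x2), to_integer_level(y)


if __name__ == "__main__":
    # [3.96] = 3, so a deterministic level of 3.96 maps to the integer level 4.
    print(to_state(3.96, 0.0, 4.2))
```

With this rule a force that has not yet been annihilated in the deterministic model is never represented by a zero level in the stochastic comparison.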

Numerical results have been generated using two FORTRAN programs
run on an IBM 360-67 computer. The program which generates $\phi_D^*(t,x_1,x_2,y)$
(and also the force level trajectories) has been discussed in Section 4.5.
The program which generates $\phi_S^*(t,m_1,m_2,n)$ performs the computations
described in Section 5.3. The program for the stochastic formulation is
limited by computer memory requirements. Results for all force levels are
retained for two time steps. A battle with $m_1^0 = 5$, $m_2^0 = 5$, and $n_0 = 5$
requires 200,000 bytes of computer memory, and this increases exponentially
with the force levels as Table IV indicates. Thus, most runs of the computer
program for the stochastic formulation have been with the above as the
upper limit for initial force levels, although we have run one case with
$m_1^0 = 9$, $m_2^0 = 9$, and $n_0 = 9$ which required nearly 2,000,000 bytes of
memory.
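
To make the phrase "results for all force levels are retained for two time steps" concrete, the following Python sketch keeps two layers of a table indexed by $(m_1,m_2,n)$ and swaps them after each time step. It is not the FORTRAN program of Section 5.3: the names `state_count`, `make_layer`, and `sweep` are hypothetical, the `update` rule is a placeholder supplied by the caller, and the assumption that each level runs from 0 to its initial value is ours. The state count is only one ingredient of the memory figures quoted above.

```python
def state_count(m1_0, m2_0, n_0):
    """Number of grid states when each force level runs from 0 to its initial value."""
    return (m1_0 + 1) * (m2_0 + 1) * (n_0 + 1)


def make_layer(m1_0, m2_0, n_0, fill=0.0):
    """Allocate one time layer of values indexed as layer[m1][m2][n]."""
    return [[[fill] * (n_0 + 1) for _ in range(m2_0 + 1)] for _ in range(m1_0 + 1)]


def sweep(m1_0, m2_0, n_0, steps, update):
    """Advance a table over `steps` time steps while storing only two layers.

    `update(prev, m1, m2, n)` is a caller-supplied placeholder that returns the
    new value at (m1, m2, n) from the previous layer.
    """
    prev = make_layer(m1_0, m2_0, n_0)
    curr = make_layer(m1_0, m2_0, n_0)
    for _ in range(steps):
        for m1 in range(m1_0 + 1):
            for m2 in range(m2_0 + 1):
                for n in range(n_0 + 1):
                    curr[m1][m2][n] = update(prev, m1, m2, n)
        prev, curr = curr, prev  # reuse the old layer as scratch space
    return prev


if __name__ == "__main__":
    print(state_count(5, 5, 5), state_count(9, 9, 9))
```

Swapping the two layers avoids allocating a new table at every time step, so storage depends on the force levels but not on the number of time steps.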

The above computer programs have been run for over fifteen different
"parameter sets," typical examples of which are shown in Table V. In all
cases we have chosen parameter values so that $a_1 b_1 > a_2 b_2$. Optimal
policies for the deterministic and the stochastic formulations have been
compared as discussed above. The results of these comparisons will now be
summarized.

† This is done so that the interval process (time between casualties) of the
casualty process will be "similar" in the deterministic and stochastic
formulations.

Table V. Parameter Sets Used to Generate Numerical
Results Given in Tables VI through VIII.

Parameter
Set          a_1      a_2      b_1      b_2      p       q       r

1            0.025    0.015    0.035    0.005    0.75    2.25    2.0
2            0.005    0.003    0.007    0.001    0.15    0.45    0.4
3            0.085    0.080    0.03     0.03     1.0     2.0     2.0

For all the above parameter sets we have $a_1 b_1 > a_2 b_2$ and $a_1 p < a_2 q$.

The first thing to be pointed out is that the optimal fire distribution
policy for both formulations has the property that $\phi^*$ is either
0 or 1 (almost everywhere in time).† For the deterministic formulation,
we have shown [34] that a singular solution is impossible and that $\phi^*$
must be 0 or 1 except for at most one point in time. Although we have not
proved such a result for the stochastic formulation, we have never encountered
any exception to it in all our numerical computations. As we have discussed
above, two cases must be distinguished:

Case (1)  $a_1 p \ge a_2 q$,

Case (2)  $a_1 p < a_2 q$.
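
As a check on the case distinction, the parameter groups $R = a_1 b_1/(a_2 b_2)$ and $\delta = a_1 p/(a_2 q)$ can be evaluated for the parameter sets of Table V. The sketch below is an illustration only; it assumes that the columns of Table V read $a_1$, $a_2$, $b_1$, $b_2$, $p$, $q$, $r$ in that order, and the names `PARAMETER_SETS`, `parameter_groups`, and `case_number` are hypothetical.

```python
# Values transcribed from Table V (column order a1, a2, b1, b2, p, q, r is assumed).
PARAMETER_SETS = {
    1: dict(a1=0.025, a2=0.015, b1=0.035, b2=0.005, p=0.75, q=2.25, r=2.0),
    2: dict(a1=0.005, a2=0.003, b1=0.007, b2=0.001, p=0.15, q=0.45, r=0.4),
    3: dict(a1=0.085, a2=0.080, b1=0.03, b2=0.03, p=1.0, q=2.0, r=2.0),
}


def parameter_groups(s):
    """Return (R, delta) with R = a1*b1/(a2*b2) and delta = a1*p/(a2*q)."""
    big_r = (s["a1"] * s["b1"]) / (s["a2"] * s["b2"])
    delta = (s["a1"] * s["p"]) / (s["a2"] * s["q"])
    return big_r, delta


def case_number(s):
    """Case (1) when a1*p >= a2*q, otherwise Case (2)."""
    return 1 if s["a1"] * s["p"] >= s["a2"] * s["q"] else 2


if __name__ == "__main__":
    for label, s in PARAMETER_SETS.items():
        big_r, delta = parameter_groups(s)
        print(label, round(big_r, 4), round(delta, 4), "Case", case_number(s))
```

Under the assumed column order all three sets have $R > 1$ and $\delta < 1$, so each falls under Case (2).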

For Case (1): $a_1 p \ge a_2 q$, the optimal policy is apparently identical
for both formulations: $\phi_D^*(t,x_1,x_2,y) = \phi_S^*(t,m_1,m_2,n) = 1$ for $x_1 > 0$
(or $m_1 > 0$). We recall that this result has been proved for the deterministic
formulation. Although a proof has not been found, it apparently
is also true for the stochastic formulation. No exception has been encountered
in all the cases for which numerical determinations have been made.


† See [36] for a discussion of why this is so and for an example of a similar
problem with a different attrition process for which $\phi^*$ may take on an
intermediate value, i.e. $0 < \phi^* < 1$ (see also [38]).


For Case (2): $a_1 p < a_2 q$, the optimal policies are similar but not
identical. The basic structures appear to be essentially the same. As
discussed  above,  the  two  policies  have  been  compared  at  selected  points 
along  a deterministic  trajectory  by  considering  a corresponding  realization 
of  the  stochastic  process  obtained  by  rounding  the  deterministic  force 
levels.  The  time  of  such  a comparison  is  rounded  up  to  the  next  whole 
minute  in  the  case  of  the  occurrence  of  a casualty  and  to  the  next  0.01 
minute  in  the  case  of  a switch  in  fire  distribution.  Cases  corresponding 
to  over  ten  parameter  sets  have  been  considered;  illustrative  examples  of 
such  parameter  sets  are  shown  in  Table  V. 

In Table VI we show some typical comparisons. Although not shown
in Table VI, it should be noted that in all the cases numerically computed
$\phi_S^*(\tau,m_1,m_2,n)$ had the property that for constant $m_1$, $m_2$, and $n$,
$\phi_S^*(\tau,m_1,m_2,n) = 0$ for $0 \le \tau < \tau_1$ and $\phi_S^*(\tau,m_1,m_2,n) = 1$ for $\tau > \tau_1$,
where $\tau$ denotes the "backwards time." In Table VI we show the optimal
policies for the two formulations for two parameter sets. The optimal
policies are given at discrete points in time following the above discussion.
These times correspond to a switching time in one of the formulations or
the occurrence of a casualty in the "typical" realization of the stochastic
process. The deterministic force levels $x_1$, $x_2$, and $y$ from which
$m_1$, $m_2$, and $n$ have been determined are not shown in Table VI. The
optimal returns for the two formulations are also shown.

The results shown in Table VI are typical and indicate (at least
for all the cases so far computed) that there is no fundamental difference
between the structures of the two optimal policies, at least where the
deterministic battle does not terminate prematurely, i.e. $t_f = t_{max}$.

Thus , these  remarks  apply  to  cases  in  which  optimal  deterministic  trajectories 
lead  Lo  terminal  states  Sj > S^,  and  S^. 


Table VI. Comparisons of Results from Deterministic
and Stochastic Optimal Control Problems
(Deterministic Trajectory Leads to Terminal State S1)

Columns of the original table: Elapsed Time, t (minutes); Force Levels $m_1$, $m_2$, $n$;
Deterministic $\phi_D^*(t,x_1,x_2,y)$ and Optimal Return; Stochastic $\phi_S^*(t,m_1,m_2,n)$ and
$S(t,m_1,m_2,n)$. Only the entries listed below are legible in the source; a digit that
cannot be read is shown as [?].

Parameter Set 1

Elapsed Time, t      Deterministic                 Stochastic
(minutes)            phi_D*    Optimal Return      phi_S*    S

0                    1         -10.95              1         -8.93
13                   1         -10.95              1         -11.16
18                   1         -10.95              1         -9.12
31                   1         -10.95              1         -10.96
35.39                1         -10.95              0         -10.79
41.28                0         -10.95              0         -10.[?]4
50 = t_f = t_max     0         -10.95              0         -10.0[?]

Parameter Set 2

Elapsed Time, t      Return
(minutes)            (only one column legible; its heading is not identifiable)

0                    -0.62
27                   -2.17
50                   -1.07
55                   -1.04
56                   -2.06
56.38                -2.03
87                   -2.16
100 = t_f = t_max    -2.05

Parameter Set 2

Elapsed Time, t      Force Levels
(minutes)            m_1   m_2   n

0                    5     5     5
5.61                 5     5     5
6.38                 5     5     5
26                   5     5     4
50 = t_f = t_max     (not legible)



The reader should note that $\phi_S^*$ changes somewhat earlier in forward time
from 1 to 0 than does $\phi_D^*$ (at least for the realization of the stochastic
process considered here).
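
The threshold structure in backwards time, and the earlier switch of the stochastic policy, can be illustrated with the Parameter Set 1 entries of Table VI. In the Python sketch below the function names are hypothetical, the switch times 41.28 and 35.39 minutes are read from Table VI as the first tabulated times at which each policy equals 0, and the value returned exactly at $\tau = \tau_1$ is an arbitrary choice because the text leaves it unspecified. The stochastic figure refers to the particular realization tabulated there, not to a single fixed state.

```python
def backwards_time(t, t_f):
    """Backwards time tau = t_f - t, i.e. the time remaining in the battle."""
    return t_f - t


def threshold_policy(tau, tau_1):
    """Policy with the observed structure: 0 for 0 <= tau < tau_1, 1 for tau > tau_1.

    The value at tau == tau_1 is not specified in the text; 1 is returned here
    purely as a convention.
    """
    if tau < 0:
        raise ValueError("backwards time must be non-negative")
    return 0 if tau < tau_1 else 1


T_F = 50.0  # t_f = t_max for Parameter Set 1 in Table VI

# First tabulated forward times at which each policy is 0 (read from Table VI).
tau_1_deterministic = backwards_time(41.28, T_F)
tau_1_stochastic = backwards_time(35.39, T_F)

if __name__ == "__main__":
    print(round(tau_1_deterministic, 2), round(tau_1_stochastic, 2))
```

A larger backwards switching time means an earlier switch in forward time, which is the relationship noted above for this realization.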

In  cases  in  which  the  deterministic  battle  ends  prematurely  (i.e. 
the  optimal  trajectory  leads  to  , S,.,  , or  S^)  more  pronounced 

quantitative differences may occur. This is illustrated by the cases shown
in Table VII. As noted above, the deterministic trajectory determines at
which values of $m_1$, $m_2$, and $t$ we look at $\phi_S^*$. This should explain
to the reader why the stochastic results shown in Table VII are not realizable.†
Thus, for the first battle shown in Table VII, a realization of the stochastic
battle would evolve differently (in structure) than the deterministic battle
due to this difference in the optimal controls. The authors feel that this
is due to the fact that Y marginally wins the deterministic battle, and
thus in the stochastic model there is a fairly good probability‡ at $t$
much less than $t_{max}$ that Y will lose the battle. In other words, there
are some possible probabilistic trajectories which yield a reduced payoff
to Y. These are weighted in the stochastic decision process, and Y consequently
follows a more conservative policy for the stochastic formulation.
For the case of the first battle shown in Table VII, Y essentially gives
up his chances of winning to guarantee a given level of return. This
phenomenon is similar to the "flypaper effect" noted by Whittle [48] in
certain stochastic optimal control problems. In the second battle shown,
Y achieves a clear-cut victory in the deterministic battle, and this
phenomenon does not occur.


† A transition from $(m_1,m_2,n) = (3,5,5)$ to $(2,5,5)$ is impossible when $\phi_S^* = 0$.

‡ This probability has not been explicitly determined.
Table VII. Comparisons of Results from Deterministic
and Stochastic Optimal Control Problems
(Deterministic Trajectory Leads to Terminal State S7)

Parameter Set 3, t_max = 50 minutes

Elapsed Time, t    Force Levels
(minutes)          m_1   m_2   n       phi_D*    phi_S*

0                  3     5     5       1         0
3                  2     5     5       1         0
5                  2     5     4       1         0
6                  1     5     4       1         0
8.59               0     5     4       0         0
11                 0     5     3       0         0
13                 0     4     3       0         0
18                 0     3     3       0         0
21                 0     3     2       0         0
24                 0     2     2       0         0
31                 0     1     2       0         0
(not legible)      0     0     2       0         0

Parameter Set 3

The headings of the five policy columns below are not legible in the source.

Elapsed Time, t    Force Levels
(minutes)          m_1   m_2   n       (i)   (ii)   (iii)   (iv)   (v)

0                  2     3     5       1     1      1       1      0
3                  1     3     5       1     1      1       0      0
5.04               0     3     5       0     0      0       0      0
8                  0     2     5       0     0      0       0      0
11                 0     1     5       0     0      0       0      0
14                 0     1     4       0     0      0       0      0
14.11 = t_f        0     0     4       0     0      0       0      0

Note: $\phi_S^{*40}$ denotes $\phi_S^*(t,m_1,m_2,n)$ computed with $t_{max}$ = 40 minutes.
In addition, in cases in which there is a premature termination in
the deterministic formulation, the optimal policy for Y in the corresponding
stochastic problem is affected by the length of the "perceived planning
horizon." This effect is shown in the data for the second battle of Table
VII in which optimal policies are given for stochastic battles of varying
lengths. We see that when the deterministic battle ends near to the
scheduled end of the stochastic battle, Y follows a more conservative
policy in the stochastic battle. Since there is some chance that Y cannot
annihilate the X forces in the "perceived length of battle," he follows
a conservative policy of firing at $X_2$. This might, in fact, explain the
results for the first battle. Other similar phenomena have been encountered
in cases not shown here.

Finally, in Table VIII we show that the optimal policy followed by
Y in a realization of the stochastic combat process may differ appreciably
from that for the deterministic formulation if the realization does not
"follow" the deterministic trajectory. It is seen that $\phi_S^*$ may repeatedly
switch back and forth between 0 and 1 for certain realizations of the stochastic
process. This is quite different from the corresponding behavior for the
deterministic version.
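
The repeated switching can be counted directly from the entries of Table VIII. The short Python sketch below is only an illustration; the lists are transcribed from the table and the function name `switch_times` is hypothetical.

```python
# Elapsed times (minutes) and stochastic policy values transcribed from Table VIII.
TIMES = [0, 0.5, 0.7, 10.0, 15.0, 20.0, 23.55, 24.0, 25.0, 26.0, 30.0, 35.0]
POLICY = [0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1]


def switch_times(times, policy):
    """Return the tabulated times at which the policy differs from the previous entry."""
    return [t for t, prev, cur in zip(times[1:], policy, policy[1:]) if cur != prev]


if __name__ == "__main__":
    switches = switch_times(TIMES, POLICY)
    print(len(switches), switches)
```

For this realization the tabulated policy changes value seven times, whereas the deterministic policy in Table VII changes only once.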

Table VIII. One Possible Dependence of Optimal Stochastic
Control on Realization of Casualties in
Stochastic Lanchester Attrition Process
(Deterministic Trajectory Leads to Terminal State S7; See Table VII.)

Parameter Set 3, t_max = 50 minutes

Elapsed Time, t    Force Levels
(minutes)          m_1   m_2   n       phi_S*(t,m_1,m_2,n)

0                  3     5     5       0
0.5                3     4     5       0
0.7                3     3     5       1
10.0               2     3     5       1
15.0               2     3     4       0
20.0               2     2     4       1
23.55              2     2     4       0
24.0               2     2     3       0
25.0               2     1     3       1
26.0               1     1     3       1
30.0               1     1     2       0
35.0               1     0     2       1

7. Discussion.

In this section we discuss what we have learned from the above comparison.
First and foremost, the authors feel that the deterministic
formulation provides more insight into the structure of the optimal fire
distribution policy. The explicit dependence of the optimal control upon
various parameter groups (these are (1) $R = a_1 b_1/(a_2 b_2)$, (2) $\delta = a_1 p/(a_2 q)$,
and (3) $\alpha$, defined in the Appendix) is readily obtained for the deterministic optimal
control problem. This has not been true for the stochastic problem for
which only the dependence upon $\delta$ has been analytically obtained.

Let us now summarize the observed differences and similarities
between the structures of the optimal policies for the deterministic and
stochastic formulations. The similarities are: (1) optimal policy always
0 or 1, (2) same parameter groups ($R$, $\delta$, and $\alpha$) upon which optimal
policy depends, (3) optimal policy dependent upon force levels and
whether Y wins or loses, (4) in both models $\phi^* = 1$ for $x_1 > 0$ when
$\delta \ge 1$ and $R > 1$, and (5) $\phi^* = 0$ for $t \in (T - \tau_1, T]$ when $0 \le \delta < 1 < R$;
furthermore $\tau_1 = \tau_1(\alpha)$.† The differences are: (1) in the stochastic
formulation the optimal policy actually implemented (i.e. followed) in a
battle depends upon the battle's probabilistic (forward) evolution (i.e.
the realization of the stochastic process) and the time remaining in the
prescribed duration battle, and (2) $\tau_1$ is "greater in the stochastic
model" except for cases corresponding to premature termination in the
deterministic battle. Overall, we feel that an understanding of the
structure of an optimal policy is best developed by considering the
deterministic version of such a combat problem. For problems too complex
for analytic treatment, rules of thumb for approximating an optimal policy
are probably best obtained from deterministic formulations.


In  [34]  and  [36]  one  can  find  further  discussion  of  the  structure  of  the 
optimal  policy,  including  interpretation  of  such  parameter  groups.  The  reader 
may  find  the  following  interpretations  useful  for  understanding  the  solution 
to  the  problem  studied  in  the  paper  at  hand.  The  quantity  a may  be  thought 
of as the rate of destroying X^'s kill capability against Y. It is a measure
of strategic (long run) return. The quantity a p represents the rate of
destruction of Xj value by Y at the end of battle. Thus, it represents short
run return. The quantity ri^b^ reflects the loss of Y value at the end of
battle so that $\alpha$ measures the loss of Y value relative to that of X at
the end of battle.

† Moreover, $\tau_1$ depends upon $m_1$, $m_2$, and $n$ in the stochastic optimal control
problem.




Finally, we would like to point out that there is a circumstance
under which the stochastic formulation is to be preferred over the
deterministic one, namely, when there is a small number (approximately
three or under) of each combatant type. As noted above, obtaining a
numerical approximate solution to the optimal stochastic control problem
is limited to small numbers of combatants due to computer memory
requirements.† In such cases of small numbers of combatants (and a
stochastic attrition process), however, the stochastic formulation as a Markov chain
is to be preferred when the required computer resources are available, for
the obvious reason that the deterministic differential equation model
cannot adequately describe the situation. This point made comparison of
results from the two formulations difficult.

8. Implications for Defense Planners.

The  authors  feel  that  the  study  of  even  the  very  simplest  abstractions 
(idealizations)  of  tactical  allocation  structures  as  considered  in  this 
paper  has  yielded  significant  implications  for  defense  planners  and 
military  operations  analysts.  First  and  foremost  is  the  fact  that  study 
of  such  deterministic  optimal  control  problems  provides  much  more  insight 
into  the  structure  of  optimal  allocation  policies  than  corresponding  stochastic 
formulations.  We  feel  that  such  deterministic  formulations  provide  a better 
understanding  of  the  effects  of  modelling  assumptions  on  optimal  military 
strategies  derived  from  the  mathematical  models.  This  is,  of  course, 
essential  for  determining  optimal  (or  near-optimal)  solutions  to  real  world 
problems  that  are  far  too  complex  to  be  solved  by  exact  analytic  methods. 

† These grow exponentially as force levels increase because of the way in
which a solution must be "built up." See Figure 1 and Table IV for
illustrations of this point.
Moreover, one might apply general principles or rules of thumb developed
from the study of such idealizations to higher resolution studies which,
for example, might use computer simulation methods.

The  study  of  the  deterministic  optimal  control  problem  (7)  in  this 
paper  yields  several  significant  results  which  should  be  kept  in  mind  by 
practitioners  who  perform  more  detailed  computer  simulation  studies. 

These  are 

(1)  Force  levels  do  affect  optimal  strategies.  Whether  one  "wins"  or 
"loses"  affects  optimal  strategies. 

(2)  Even the nature of the scenario (terminal control or prescribed duration
conflict) may affect optimal strategies. Thus, if one develops
"good" tactics for a 90 day campaign, such tactics need not be "good"
if the conflict does not terminate at the prescribed time.

(3)  The nature of the attrition process has a significant effect upon
optimal strategies.†

Finally, the authors feel that the above results indicate that more
basic research should be done on the termination of battles and wars‡ as
well as on combat attrition theories. The demonstrated sensitivity of results
obtained from optimization problems like the one considered here shows
this.


† This result has been pointed out elsewhere [36], [38] and is partially
based on the study of a similar problem [38].

‡ Some work has been done in this direction [14], [33], [46], [47], although it
does not appear to be widely known among practicing analysts.


APPENDIX. 


Explanation  of  Notation. 
The symbols which are used in this paper are defined as follows:

$a_1, a_2, b_1, b_2$ = constant attrition-rate coefficients,

$A(m,n), B(m,n)$ = attrition rates of X and Y forces, respectively, in
stochastic battle; it should be noted that
Prob[one X casualty in time interval from $t$ to $t + \Delta t$] = $A(m,n)\Delta t$,

$E_{m,\tau}[\cdot]$ = conditional expectation (mathematical expectation of quantity
in brackets at $\tau = 0$ given that at $\tau$ we have $m(\tau) = (m_1(\tau), m_2(\tau), n(\tau))$),

$H$ = Hamiltonian function,

$M_1(t), M_2(t), N(t)$ = the numbers (random variables) of $X_1$, $X_2$, and Y
combatants, respectively, at time $t$,

$m_1, m_2, n$ = realizations of the random variables $M_1(t)$, $M_2(t)$, and $N(t)$;
initial values denoted as $m_1^0$, $m_2^0$, $n_0$,

$p, q, r$ = utilities assigned to surviving $X_1$, $X_2$, and Y forces,
respectively,

$p_i(t)$ for $i = 1,2,3$ = dual variable corresponding to $x_i(t)$
($x_3(t) = y(t)$),

$\mathbf{p} = (p_1, p_2, p_3)$ (a vector),

$P(t,m,n)$ = Prob[$M(t) = m$, $N(t) = n$] = state probability,

$P = (x_1^0, x_2^0, y_0)$ = point in the initial state space,

$R = a_1 b_1/(a_2 b_2)$,

$S(\tau,m_1,m_2,n)$ = optimal expected value function,

S = numerical approximation to $S(\tau,m_1,m_2,n)$,

$S_i$ for $i = 1,\ldots,8$ = the $i$-th part of the terminal surface as defined
in Table I,

$s = s(x_1,x_2) = b_1 x_1 + b_2 x_2$,

$t$ = time after beginning of battle,

$t_1$ = time at which $X_1$ is annihilated, i.e. $x_1(t_1) = 0$,

t?  = first  time  at  which  2b  x (t,)Xj  + b„(x°)2  “ a„y2(t  ) for  an 
extremal  leading  to  S^, 

t..  = last  time  at  which  fire  is  directed  at  X..  for  an  extremal  leading 
to  S3, 

t,  = time  at  which  X2  is  annihilated  (before  X^)  , i.e.  x2(t^)  ■ 0, 

for  an  extremal  leading  to  SQ, 

o 

$t_f$ = time at which battle ends,

$t_{max}$ = maximum possible duration for battle, i.e. $t_f \le t_{max}$,

$v = v(\tau) = a_2 p_2(\tau) - a_1 p_1(\tau)$,

$W(\tau,m_1,m_2,n)$ = "switching function" defined by equation (44),

$x_1, x_2, y$ = average force strengths; with initial values $x_1^0$, $x_2^0$, $y_0$,


z = cosh»/a2b2  (S^) 


qv  a_ 


R-6 
R-l  * 


$\delta = a_1 p/(a_2 q)$,

$\eta_i(t)$ for $i = 1,2$ = multiplier corresponding to state variable
inequality constraint $x_i \ge 0$,

$\nu_i$ for $i = 1,2$ = multiplier corresponding to state variable terminal
inequality constraint $x_i(T) \ge 0$,

$\phi_D$ ($\phi_S$) = fraction of Y-fire directed at $X_1$ in deterministic (stochastic)
formulation; the optimal control is denoted with an asterisk, e.g. $\phi_D^*$
(the symbol used for the extremal control is not legible in the source),



♦ = { 0 , At  » ~iy . • • • » , 1 ) = set  of  admissible  controls  in  stochastic 
' ' problem, 

$\tau$ = "backwards time" from the end of battle defined by $\tau = t_f - t$, i.e.
the time remaining before the end of battle,

$\tau_1(S_i)$ = "backwards time" of the first switch in tactics for extremals
leading to $S_i$.

Additionally, remarks similar to those for $\tau_1(S_i)$ above apply to
$t_1(S_i)$, $t_f(S_i)$, etc.